metadata

title: Voxtral
emoji: ⚡
colorFrom: gray
colorTo: green
sdk: gradio
sdk_version: 5.38.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Chat and transcribe audio files with AI, powered by Voxtral.

Voxtral Pro Interface

An advanced, feature-rich Gradio UI to explore the full power of Mistral AI's multimodal model, `voxtral`.


🚀 About The Project

Voxtral Pro explores and showcases the capabilities of Mistral AI's multimodal model, `voxtral`. It goes beyond a simple chat interface to provide a toolkit for working with audio and text: high-quality transcription, multi-turn multimodal conversation, and agent-like tool use.

This project serves as a practical example of how to build robust, user-friendly, and production-ready applications on top of state-of-the-art foundation models.

✨ Key Features

🎙️ High-Quality Transcription: Transcribe large audio files with exceptional accuracy using the Mistral API.
📄 SRT Subtitle Generation: Automatically generate and export .srt subtitle files with precise segment timestamps, perfect for content creators.
💬 Multimodal Chat: Engage in rich, multi-turn conversations combining both text and audio inputs simultaneously.
🤖 Tool Use / Function Calling: Demonstrates the model's ability to call external functions to retrieve information (e.g., getting city data), showcasing its agent-like capabilities.
🔐 Secure API Key Handling: Your Mistral API key is stored securely in your browser's session storage and is never exposed or saved elsewhere.
🎨 Modern UI: A clean, responsive, and aesthetically pleasing interface built with Gradio.
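
As an illustration of the SRT export feature, here is a minimal sketch of turning timestamped segments into `.srt` text. It assumes each segment is a mapping with `start` and `end` in seconds plus a `text` field; that shape, and the function names `format_timestamp` and `segments_to_srt`, are hypothetical and not taken from `app.py`.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Mapping]) -> str:
    """Build SRT text from segments shaped like {"start", "end", "text"} (assumed shape)."""
    blocks = []
    for index, seg in enumerate(segments, start=1):  # SRT cues are numbered from 1
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Joining with a newline leaves the blank line SRT expects between cues
    return "\n".join(blocks)
```

Writing the result to a file with UTF-8 encoding should give a subtitle file that most players accept.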

🛠️ Tech Stack

This project is built with a modern, asynchronous Python stack:

Backend: Python
Web Framework: Gradio
API Client: httpx with asyncio for non-blocking API calls.
Deployment: Hugging Face Spaces
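
To show what "non-blocking" means in practice, the sketch below runs several awaited calls concurrently with `asyncio`. It uses only the standard library: `fake_api_call` is a hypothetical stand-in for a real request made with an `httpx.AsyncClient`, and the concurrency cap is an assumption rather than something `app.py` is known to do.

```python
import asyncio
from typing import List


async def fake_api_call(name: str, delay: float) -> str:
    """Hypothetical stand-in for an awaited HTTP request."""
    await asyncio.sleep(delay)  # yields to the event loop while "waiting on the network"
    return f"done:{name}"


async def run_all(names: List[str], max_concurrent: int = 2) -> List[str]:
    """Run the calls concurrently, capped by a semaphore; results keep input order."""
    limiter = asyncio.Semaphore(max_concurrent)

    async def guarded(name: str) -> str:
        async with limiter:  # at most max_concurrent calls in flight
            return await fake_api_call(name, delay=0.01)

    return list(await asyncio.gather(*(guarded(n) for n in names)))
```

While one call is waiting, the event loop is free to serve other requests, which is what keeps a UI responsive during long transcriptions.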

🏁 Getting Started

Follow these instructions to get a local copy up and running.

Prerequisites

Python 3.9+
Git

Installation & Configuration

Clone the repository:

```
git clone https://huggingface.co/spaces/hasanbasbunar/Voxtral && cd Voxtral
```

Create and activate a virtual environment:

```
python3 -m venv .venv
source .venv/bin/activate
```

Install dependencies:
```
pip install -r requirements.txt
```
Configure your API Key: Create a file named .env in the root of the project and add your Mistral API key:
```
MISTRAL_API_KEY="your_api_key_here"
```
The application is also designed to let you enter the key directly in the UI if you prefer not to use an .env file.
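
One way the two key sources could be combined is sketched below. It assumes a key typed into the UI takes precedence over `MISTRAL_API_KEY`, and that the `.env` file has already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`); the function name `resolve_api_key` and the precedence order are hypothetical, not taken from `app.py`.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(ui_key: Optional[str], env: Mapping[str, str] = os.environ) -> str:
    """Prefer a key entered in the UI, else fall back to MISTRAL_API_KEY (assumed order)."""
    candidate = (ui_key or "").strip() or env.get("MISTRAL_API_KEY", "").strip()
    if not candidate:
        raise ValueError("No Mistral API key: enter one in the UI or set MISTRAL_API_KEY.")
    return candidate
```

Failing early with a clear message avoids sending a request that the API would reject anyway.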

Running the Application

Launch the app:
```
python app.py
```
Open your browser and navigate to http://127.0.0.1:7860.

🚢 Deployment

This app is designed to be easily deployed. It is currently live on Hugging Face Spaces.

To deploy your own version, you can use any platform that supports Python applications. For a production environment, set debug=False in the launch() call in app.py.

Example for platforms that use a PORT environment variable:

```
# in app.py
import os

# Use the platform-provided PORT if set, otherwise Gradio's default 7860
demo.launch(server_port=int(os.environ.get("PORT", 7860)), debug=False)
```
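
On many container platforms the app must also listen on all interfaces, not just localhost. The sketch below gathers the launch settings in one place; `server_name`, `server_port`, and `debug` are real `launch()` parameters in Gradio, while the helper `launch_options` and the rule "bind to 0.0.0.0 only when PORT is set" are assumptions made for illustration.

```python
import os
from typing import Any, Dict, Mapping


def launch_options(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Collect keyword arguments for demo.launch() from the environment."""
    on_platform = "PORT" in env  # assumption: the host sets PORT when it manages routing
    return {
        # Bind to all interfaces only when a platform supplies the port
        "server_name": "0.0.0.0" if on_platform else "127.0.0.1",
        "server_port": int(env.get("PORT", "7860")),
        "debug": False,
    }
```

In `app.py` this would be used as `demo.launch(**launch_options())`.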