---
title: Voxtral
emoji: ⚡
colorFrom: gray
colorTo: green
sdk: gradio
sdk_version: 5.38.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Chat and transcribe audio files with AI, powered by Voxtral.
---
# Voxtral Pro Interface
<p align="center">
An advanced, feature-rich Gradio UI to explore the full power of Mistral AI's multimodal model, `voxtral`.
</p>
<p align="center">
<img src="image.png" alt="Voxtral Pro Demo" width="80%">
</p>
<p align="center">
<img src="image-1.png" alt="Voxtral Pro Demo" width="80%">
</p>
## 🚀 About The Project
Voxtral Pro was created to explore and showcase the full range of capabilities of Mistral AI's powerful multimodal model, `voxtral`. This application goes beyond a simple chat interface to provide a comprehensive toolkit for interacting with audio and text, demonstrating features like high-quality transcription, multi-turn multimodal conversation, and agent-like tool use.
This project serves as a practical example of how to build robust, user-friendly, and production-ready applications on top of state-of-the-art foundation models.
## ✨ Key Features
* **🎙️ High-Quality Transcription:** Transcribe large audio files with exceptional accuracy using the Mistral API.
* **📄 SRT Subtitle Generation:** Automatically generate and export `.srt` subtitle files with precise segment timestamps, perfect for content creators.
* **💬 Multimodal Chat:** Engage in rich, multi-turn conversations combining both text and audio inputs simultaneously.
* **🤖 Tool Use / Function Calling:** Demonstrates the model's ability to call external functions to retrieve information (e.g., getting city data), showcasing its agent-like capabilities.
* **🔐 Secure API Key Handling:** Your Mistral API key is stored securely in your browser's session storage and is never exposed or saved elsewhere.
* **🎨 Modern UI:** A clean, responsive, and aesthetically pleasing interface built with Gradio.
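
The SRT export listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: it assumes the transcription result has already been reduced to a list of segments with `start` and `end` times in seconds and a `text` field, and the helper names `format_timestamp` and `segments_to_srt` are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text from segments (assumed keys: 'start', 'end', 'text')."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each cue is: running number, time range, text, then a blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

Writing the returned string to a file with an `.srt` extension should be enough for most video players and editors to pick it up.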
## 🛠️ Tech Stack
This project is built with a modern, asynchronous Python stack:
* **Backend:** [Python](https://www.python.org/)
* **Web Framework:** [Gradio](https://www.gradio.app/)
* **API Client:** [httpx](https://www.python-httpx.org/) with `asyncio` for non-blocking API calls.
* **Deployment:** [Hugging Face Spaces](https://huggingface.co/spaces)
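
To illustrate the non-blocking call pattern named in the list above, here is a hedged sketch of sending several audio files concurrently with `httpx.AsyncClient` and `asyncio.gather`. The endpoint URL, model name, bearer-token header, and all function names are assumptions made for illustration; check them against Mistral's API reference and the actual `app.py` before relying on them.

```python
import asyncio

# Assumed values, not taken from this project: verify against Mistral's API docs.
TRANSCRIPTION_URL = "https://api.mistral.ai/v1/audio/transcriptions"
MODEL = "voxtral-mini-latest"


def build_request(api_key: str) -> dict:
    """Return the headers and form fields for one transcription request."""
    return {
        "headers": {"Authorization": f"Bearer {api_key}"},
        "data": {"model": MODEL},
    }


async def transcribe(client, api_key: str, filename: str, audio: bytes) -> dict:
    """POST one audio file; awaiting the response lets other tasks keep running."""
    req = build_request(api_key)
    response = await client.post(
        TRANSCRIPTION_URL,
        headers=req["headers"],
        data=req["data"],
        files={"file": (filename, audio)},
    )
    response.raise_for_status()
    return response.json()


async def transcribe_many(api_key: str, files: dict[str, bytes]) -> list[dict]:
    """Send several files concurrently over one shared AsyncClient."""
    import httpx  # third-party dependency: pip install httpx

    async with httpx.AsyncClient(timeout=120.0) as client:
        tasks = [transcribe(client, api_key, name, data) for name, data in files.items()]
        return await asyncio.gather(*tasks)
```

Sharing one client across requests reuses connections, and `asyncio.gather` returns the results in the same order as the input files.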
## 🏁 Getting Started
Follow these instructions to get a local copy up and running.
### Prerequisites
* Python 3.9+
* Git
### Installation & Configuration
1. **Clone the repository:**
`git clone https://huggingface.co/spaces/hasanbasbunar/Voxtral && cd Voxtral`
2. **Create and activate a virtual environment:**
```sh
python3 -m venv .venv
source .venv/bin/activate
```
3. **Install dependencies:**
```sh
pip install -r requirements.txt
```
4. **Configure your API Key:**
Create a file named `.env` in the root of the project and add your Mistral API key:
```
MISTRAL_API_KEY="your_api_key_here"
```
*The application is also designed to let you enter the key directly in the UI if you prefer not to use an `.env` file.*
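
The two ways of supplying the key could be combined along these lines. This is a sketch under stated assumptions rather than the app's actual logic: `load_env_file` and `resolve_api_key` are hypothetical helpers, the `.env` parser handles only simple `KEY=VALUE` lines, and a key typed into the UI is assumed to take precedence over the environment.

```python
import os
from pathlib import Path
from typing import Optional


def load_env_file(path: str = ".env") -> None:
    """Copy KEY=VALUE pairs from a .env file into os.environ (existing values win)."""
    env_path = Path(path)
    if not env_path.is_file():
        return
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip().strip("\"'"))


def resolve_api_key(ui_key: Optional[str] = None) -> str:
    """Prefer a key entered in the UI, then fall back to MISTRAL_API_KEY."""
    key = (ui_key or "").strip() or os.environ.get("MISTRAL_API_KEY", "").strip()
    if not key:
        raise ValueError("No Mistral API key: set MISTRAL_API_KEY or enter one in the UI.")
    return key
```

Calling `load_env_file()` once at startup and `resolve_api_key(ui_key)` before each request keeps the lookup in one place.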
### Running the Application
1. **Launch the app:**
```sh
python app.py
```
2. Open your browser and navigate to `http://127.0.0.1:7860`.
## 🚢 Deployment
This app is designed to be easily deployed. It is currently live on [Hugging Face Spaces](https://huggingface.co/spaces/hasanbasbunar/Voxtral).
To deploy your own version, you can use any platform that supports Python applications. For a production environment, ensure `debug=False` in `app.py`.
Example for platforms that use a `PORT` environment variable:
```python
# in app.py
import os

# Use the platform-provided PORT if it is set, otherwise port 7860.
demo.launch(server_port=int(os.environ.get("PORT", 7860)), debug=False)
```