hasanbasbunar committed on Jul 21

Commit

96fe96c

1 Parent(s): 14d9c3a

README update

Files changed (2)

.gitignore +2 -1
README.md +97 -54

.gitignore CHANGED

@@ -89,4 +89,5 @@ test_output/
 uploads/
 # Ignore SVGs if generated
-generated_svg/
+generated_svg/
+testt.py

README.md CHANGED

@@ -10,59 +10,102 @@ pinned: false
 license: apache-2.0
 short_description: Chat and transcribe audio files with AI, powered by Voxtral.
 ---
-# Voxtral
-**Multimodal chatbot and audio transcription web app powered by Gradio and Mistral API.**
-## Features
-- Chatbot with text and audio input
-- Audio file transcription (with SRT export)
-- Modern Gradio web interface
-- API key management (secure, local to browser)
-## Demo
-![Screenshot](c29ca011-87ff-45b0-8236-08d629812732.svg)
-## Installation
-1. **Clone the repository**
-   ```bash
-   git clone <repo-url>
-   cd voxtral-gradio
-   ```
-2. **Create and activate a virtual environment**
-   ```bash
-   python3 -m venv .venv
-   source .venv/bin/activate
-   ```
-3. **Install dependencies**
-   ```bash
-   pip install -r requirements.txt
-   ```
-## Usage
-1. **Run the app**
-   ```bash
-   python app.py
-   ```
-2. Open your browser and go to [http://localhost:7860](http://localhost:7860)
-3. Enter your Mistral API key in the interface to start chatting or transcribing audio files.
-## Configuration
-- **API Key:** Your Mistral API key is required for chat and transcription features. It is stored only in your browser session and never sent to any third-party server.
-- **Environment variables:** Not required by default. For cloud deployment, you may need to set the `PORT` environment variable.
-## Deployment
-- For production, set `debug=False` in `app.py`.
-- Compatible with most Python hosting platforms (Heroku, Railway, etc.).
-- To specify a custom port:
-  ```python
-  demo.launch(server_port=int(os.environ.get("PORT", 7860)), debug=False)
-  ```
-## License
-MIT
----
+# Voxtral Pro Interface
+<div align="center">
+![Python](https://img.shields.io/badge/Python-3.9+-blue?logo=python&logoColor=white)
+![Gradio](https://img.shields.io/badge/Gradio-5.37-orange?logo=gradio)
+![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)
+<a href="https://huggingface.co/spaces/hasanbasbunar/Voxtral">![Hugging Face Spaces](https://img.shields.io/badge/🤗%20Hugging%20Face-Spaces-yellow)</a>
+</div>
+<p align="center">
+  An advanced, feature-rich Gradio UI to explore the full power of Mistral AI's multimodal model, `voxtral`.
+</p>
+<p align="center">
+  <img src="image.png" alt="Voxtral Pro Demo" width="80%">
+</p>
+<p align="center">
+  <img src="image-1.png" alt="Voxtral Pro Demo" width="80%">
+</p>
+## 🚀 About The Project
+Voxtral Pro was created to explore and showcase the full range of capabilities of Mistral AI's powerful multimodal model, `voxtral`. This application goes beyond a simple chat interface to provide a comprehensive toolkit for interacting with audio and text, demonstrating features like high-quality transcription, multi-turn multimodal conversation, and agent-like tool use.
+This project serves as a practical example of how to build robust, user-friendly, and production-ready applications on top of state-of-the-art foundation models.
+## ✨ Key Features
+* **🎙️ High-Quality Transcription:** Transcribe large audio files with exceptional accuracy using the Mistral API.
+* **📄 SRT Subtitle Generation:** Automatically generate and export `.srt` subtitle files with precise segment timestamps, perfect for content creators.
+* **💬 Multimodal Chat:** Engage in rich, multi-turn conversations combining both text and audio inputs simultaneously.
+* **🤖 Tool Use / Function Calling:** Demonstrates the model's ability to call external functions to retrieve information (e.g., getting city data), showcasing its agent-like capabilities.
+* **🔐 Secure API Key Handling:** Your Mistral API key is stored securely in your browser's session storage and is never exposed or saved elsewhere.
+* **🎨 Modern UI:** A clean, responsive, and aesthetically pleasing interface built with Gradio.
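
The SRT export named in the feature list above can be sketched as follows. This is not part of the commit: it assumes the transcription step yields a list of segments with `start` and `end` times in seconds plus a `text` field, and the helper names `format_timestamp` and `segments_to_srt` are hypothetical. The app's actual implementation may differ.

```python
from typing import Dict, List, Union

Segment = Dict[str, Union[float, str]]  # assumed shape: {"start": s, "end": s, "text": str}


def format_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(float(seg["start"]))
        end = format_timestamp(float(seg["end"]))
        text = str(seg["text"]).strip()
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)
```

Writing the returned string to a file with an `.srt` extension produces a subtitle file that standard players can load.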
+## 🛠️ Tech Stack
+This project is built with a modern, asynchronous Python stack:
+* **Backend:** [Python](https://www.python.org/)
+* **Web Framework:** [Gradio](https://www.gradio.app/)
+* **API Client:** [httpx](https://www.python-httpx.org/) with `asyncio` for non-blocking API calls.
+* **Deployment:** [Hugging Face Spaces](https://huggingface.co/spaces)
+## 🏁 Getting Started
+Follow these instructions to get a local copy up and running.
+### Prerequisites
+* Python 3.9+
+* Git
+### Installation & Configuration
+1.  **Clone the repository:**
+    git clone https://huggingface.co/spaces/hasanbasbunar/Voxtral && cd Voxtral
+2.  **Create and activate a virtual environment:**
+    ```sh
+    python3 -m venv .venv
+    source .venv/bin/activate
+    ```
+3.  **Install dependencies:**
+    ```sh
+    pip install -r requirements.txt
+    ```
+4.  **Configure your API Key:**
+    Create a file named `.env` in the root of the project and add your Mistral API key:
+    ```
+    MISTRAL_API_KEY="your_api_key_here"
+    ```
+    *The application is also designed to let you enter the key directly in the UI if you prefer not to use an `.env` file.*
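
One way the key lookup described in step 4 could work is sketched below. This is not taken from the commit: the function names `load_env_file` and `resolve_api_key` are hypothetical, the precedence order (UI value, then process environment, then `.env` file) is an assumption, and the real app may rely on a library such as `python-dotenv` instead of a hand-written parser.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes, as in MISTRAL_API_KEY="..."
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(ui_key: Optional[str] = None, env_path: str = ".env") -> Optional[str]:
    """Return the key typed in the UI if present, else fall back to the environment."""
    if ui_key and ui_key.strip():
        return ui_key.strip()
    return os.environ.get("MISTRAL_API_KEY") or load_env_file(env_path).get("MISTRAL_API_KEY")
```

Returning `None` when no key is found lets the caller prompt for one in the UI rather than failing at startup.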
+### Running the Application
+1.  **Launch the app:**
+    ```sh
+    python app.py
+    ```
+2.  Open your browser and navigate to `http://127.0.0.1:7860`.
+## 🚢 Deployment
+This app is designed to be easily deployed. It is currently live on [Hugging Face Spaces](https://huggingface.co/spaces/hasanbasbunar/Voxtral).
+To deploy your own version, you can use any platform that supports Python applications. For a production environment, ensure `debug=False` in `app.py`.
+Example for platforms that use a `PORT` environment variable:
+```python
+# in app.py (requires `import os` at the top of the file)
+demo.launch(server_port=int(os.environ.get("PORT", 7860)), debug=False)