---
datasets:
- mozilla-foundation/common_voice_17_0
language:
- ar
base_model:
- SWivid/F5-TTS
pipeline_tag: text-to-speech
tags:
- speech
- f5-tts
- arabic
---

# F5-TTS: Fine-Tuned Arabic Speech Synthesis Model

## Overview

This project fine-tunes the F5-TTS model for high-quality Arabic speech synthesis, incorporating regional diversity in pronunciation and accents. Fine-tuning is ongoing, and temporary checkpoints are provided as progress updates. Future iterations will include improved models with enhanced accuracy and naturalness.

## License

This model is released under the **CC BY-NC 4.0** license, which allows free usage, modification, and distribution for **non-commercial** purposes.

## Datasets

Training is based on the **Common Voice Arabic dataset** (`mozilla-foundation/common_voice_17_0`), so the model primarily supports Modern Standard Arabic (MSA).

## Model Information

- **Base Model:** SWivid/F5-TTS
- **Current Status:** Ongoing fine-tuning (Temporary Checkpoints Available)
- *(Final training parameters will be updated upon completion of fine-tuning.)*

## Usage Instructions

To use the fine-tuned Arabic model, follow these steps:

### Method 1: Manual Model Replacement

1. **Run the F5-TTS Application**
   - Start the application and locate the model file path displayed in the terminal. Example:
     ```
     model : C:\Users\yourname\.cache\huggingface\hub\models--SWivid--F5-TTS\snapshots\995ff41929c08ff968786b448a384330438b5cb6\F5TTS_Base\model_1200000.safetensors
     ```
2. **Replace the Model File**
   - Download the **Arabic checkpoint** and **vocabulary files** from this repository, place them in the same directory as the base model file, and use them instead of the base checkpoint.
3. **Restart the Application**
   - Relaunch the F5-TTS application to load the Arabic fine-tuned model.
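
The file-replacement step above can be sketched in Python. This is a minimal illustration, not part of F5-TTS: the helper name `install_arabic_checkpoint` and the example file names are hypothetical, and the `overwrite_base` option assumes the application loads whatever file sits at the base model's path, which this README does not confirm. The base file is backed up first so the original stays recoverable.

```python
import shutil
from pathlib import Path


def install_arabic_checkpoint(base_model_path, arabic_checkpoint, vocab_file,
                              overwrite_base=False):
    """Copy the Arabic checkpoint and vocabulary next to the base model file.

    base_model_path: the model path printed by the application at startup.
    overwrite_base: if True, back up the base file and write the Arabic
        checkpoint under the base file's name (an assumption about how the
        application finds its model, not confirmed by this README).
    Returns the list of paths that were written.
    """
    base_model_path = Path(base_model_path)
    arabic_checkpoint = Path(arabic_checkpoint)
    target_dir = base_model_path.parent
    written = []

    # Place the downloaded files in the same directory as the base model.
    for src in (arabic_checkpoint, Path(vocab_file)):
        dest = target_dir / src.name
        if src.resolve() != dest.resolve():  # skip if already in place
            shutil.copy2(src, dest)
        written.append(dest)

    if overwrite_base:
        backup = base_model_path.with_name(base_model_path.name + ".bak")
        if base_model_path.exists() and not backup.exists():
            shutil.copy2(base_model_path, backup)  # keep the original
        # Remove the old entry first: in the Hugging Face cache it may be a
        # symlink, and writing through it would modify the shared blob.
        base_model_path.unlink(missing_ok=True)
        shutil.copy2(arabic_checkpoint, base_model_path)
        written.append(base_model_path)

    return written


# Example call (file names are placeholders, not the repository's actual names):
# install_arabic_checkpoint(
#     base_model_path=path_printed_in_terminal,
#     arabic_checkpoint="downloads/arabic_checkpoint.safetensors",
#     vocab_file="downloads/vocab.txt",
#     overwrite_base=True,
# )
```

After running it, restart the application as in step 3. To revert, delete the replaced file and rename the `.bak` copy back to the original name.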

### Alternative Methods

- **GitHub Repository:** Follow the [F5-TTS setup instructions](https://github.com/SWivid/F5-TTS), but replace the default model with the Arabic checkpoint and vocabulary files provided here.

## Contributions & Collaboration

This model is a **work in progress**, and community contributions are highly encouraged! Suggestions, improvements, and dataset contributions are welcome to refine its performance across different Arabic dialects.

### Recommendations for Better Results

- Use **clear reference audio** with minimal background noise.
- Ensure **balanced audio levels** for improved synthesis quality.
- Contributions in **dataset expansion** and **model evaluation** are highly valuable.

If you have any questions or suggestions, feel free to reach out! 🚀