|
F5-TTS: Fine-Tuned Arabic Speech Synthesis Model |
|
License: CC BY-NC 4.0 |
|
Base Model: SWivid/F5-TTS |
|
|
|
Overview |
|
This project fine-tunes the F5-TTS model for Arabic speech synthesis, with the goal of covering regional variation in pronunciation and accent. Fine-tuning is ongoing, and the checkpoints published here are temporary progress snapshots. Later releases are expected to improve accuracy and naturalness. |
|
|
|
License |
|
This model is released under the CC BY-NC 4.0 license, which permits use, modification, and redistribution for non-commercial purposes, provided that attribution is given. |
|
|
|
Datasets |
|
Training is based on the Common Voice Arabic Dataset, a crowdsourced dataset featuring diverse Arabic accents and dialects. Additional datasets may be incorporated in future updates to improve dialectal coverage and pronunciation accuracy. |
|
|
|
Model Information |
|
Base Model: SWivid/F5-TTS |
|
Current Status: Ongoing fine-tuning (Temporary Checkpoints Available) |
|
Training Configuration: |
|
Batch Size: TBD |
|
Max Samples: TBD |
|
Training Steps: TBD |
|
(Final training parameters will be updated upon completion of fine-tuning.) |
|
Usage Instructions |
|
To use the fine-tuned Arabic model, follow these steps: |
|
|
|
Method 1: Manual Model Replacement |
|
Run the F5-TTS Application |
|
Start the application and locate the model file path displayed in the terminal. Example: |
|
model : C:\Users\yourname\.cache\huggingface\hub\models--SWivid--F5-TTS\snapshots\995ff41929c08ff968786b448a384330438b5cb6\F5TTS_Base\model_1200000.safetensors |
|
Replace the Model File |
|
Navigate to the displayed file location. |
|
Rename the existing model file: |
|
model_1200000.safetensors → model_1200000.safetensors.bak |
|
Download the Arabic checkpoint and vocabulary files from this repository and place them in the same directory. |
|
Restart the Application |
|
Relaunch the F5-TTS application to load the Arabic fine-tuned model. |
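
The manual replacement steps above can also be scripted. The following is a minimal sketch, not part of F5-TTS itself: the helper names `find_base_models` and `swap_checkpoint` are hypothetical, the cache location is assumed to be the default Hugging Face hub directory shown in the example path, and it assumes the application loads whatever file sits under the original name, so the Arabic checkpoint is copied to `model_1200000.safetensors`.

```python
import shutil
from pathlib import Path
from typing import List, Optional

# Assumption: default Hugging Face cache layout, as in the example path above.
DEFAULT_HUB_DIR = Path.home() / ".cache" / "huggingface" / "hub"
MODEL_NAME = "model_1200000.safetensors"


def find_base_models(hub_dir: Path = DEFAULT_HUB_DIR) -> List[Path]:
    """Return every cached copy of the base F5-TTS model file under hub_dir."""
    pattern = f"models--SWivid--F5-TTS/snapshots/*/F5TTS_Base/{MODEL_NAME}"
    return sorted(hub_dir.glob(pattern))


def swap_checkpoint(
    base_model: Path,
    arabic_checkpoint: Path,
    vocab_file: Optional[Path] = None,
) -> Path:
    """Back up base_model as <name>.bak and copy the Arabic checkpoint in its place.

    If vocab_file is given, it is copied into the same directory under its
    own filename. Returns the path of the backup file.
    """
    backup = base_model.with_name(base_model.name + ".bak")
    if backup.exists():
        # Refuse to overwrite an earlier backup of the original model.
        raise FileExistsError(f"Backup already exists: {backup}")
    base_model.rename(backup)
    # Assumption: the app loads the file by its original name.
    shutil.copyfile(arabic_checkpoint, base_model)
    if vocab_file is not None:
        shutil.copyfile(vocab_file, base_model.parent / vocab_file.name)
    return backup
```

Calling `find_base_models()` first and printing the result is a safe way to confirm the path matches the one shown in the terminal before running `swap_checkpoint`. To restore the original model, delete the copied checkpoint and rename the `.bak` file back.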
|
Alternative Methods |
|
GitHub Repository: Follow the F5-TTS setup instructions, but replace the default model with the Arabic checkpoint and vocabulary files provided here. |
|
Contributions & Collaboration |
|
This model is a work in progress, and community contributions are highly encouraged! Suggestions, improvements, and dataset contributions are welcome to refine its performance across different Arabic dialects. |
|
|
|
Recommendations for Better Results |
|
Use clear reference audio with minimal background noise. |
|
Keep the reference audio at a consistent level, without clipping or very quiet passages, for improved synthesis quality. |
|
Contributions to dataset expansion and model evaluation are highly valuable. |
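
One way to check the reference-audio recommendations above before synthesis is to measure the clip's peak and RMS level. The sketch below is an illustration only: the function name `audio_levels_dbfs` is hypothetical, it handles 16-bit PCM WAV files only, and it sets no thresholds, since what counts as an acceptable level is left to the user.

```python
import math
import wave
from array import array
from typing import Tuple


def _to_db(ratio: float) -> float:
    """Convert a linear amplitude ratio to decibels; silence maps to -inf."""
    return 20 * math.log10(ratio) if ratio > 0 else float("-inf")


def audio_levels_dbfs(wav_path: str) -> Tuple[float, float]:
    """Return (peak, rms) of a 16-bit PCM WAV file in dBFS.

    0 dBFS is full scale. All channels are measured together.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("This sketch only supports 16-bit PCM WAV files")
        frames = wav.readframes(wav.getnframes())

    # The wave module returns samples in native byte order.
    samples = array("h")
    samples.frombytes(frames)
    if not samples:
        raise ValueError("Audio file contains no samples")

    full_scale = 32768.0
    peak = max(abs(s) for s in samples) / full_scale
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / full_scale
    return _to_db(peak), _to_db(rms)
```

A peak at or very near 0 dBFS suggests the clip may be clipped, while a very low RMS suggests it is quiet relative to full scale. Either can be corrected in an audio editor before the clip is used as a reference.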
|
If you have any questions or suggestions, feel free to reach out! |
|
|
|
|