# F5-TTS: Fine-Tuned Arabic Speech Synthesis Model

**License:** CC BY-NC 4.0  
**Base Model:** SWivid/F5-TTS  

## Overview
This project fine-tunes the F5-TTS model for Arabic speech synthesis, with the goal of covering regional variation in pronunciation and accent. Fine-tuning is still in progress: the checkpoints published here are interim snapshots, and later releases are intended to improve accuracy and naturalness.

## License
This model is released under the **CC BY-NC 4.0** license, which permits use, modification, and distribution for **non-commercial** purposes, provided that attribution is given.

## Datasets
Training is based on the **Common Voice Arabic Dataset**, a crowdsourced dataset featuring diverse Arabic accents and dialects. Additional datasets may be incorporated in future updates to improve dialectal coverage and pronunciation accuracy.

## Model Information
- **Base Model:** SWivid/F5-TTS  
- **Current Status:** Ongoing fine-tuning (Temporary Checkpoints Available)  
- **Training Configuration:**  
  - **Batch Size:** TBD  
  - **Max Samples:** TBD  
  - **Training Steps:** TBD  
  - *(Final training parameters will be updated upon completion of fine-tuning.)*

## Usage Instructions
To use the fine-tuned Arabic model, follow these steps:

### Method 1: Manual Model Replacement
1. **Run the F5-TTS Application**  
   - Start the application and locate the model file path displayed in the terminal. Example:  
     ```
     model : C:\Users\yourname\.cache\huggingface\hub\models--SWivid--F5-TTS\snapshots\995ff41929c08ff968786b448a384330438b5cb6\F5TTS_Base\model_1200000.safetensors
     ```
2. **Replace the Model File**  
   - Navigate to the displayed file location.  
   - Rename the existing model file:  
     ```
     model_1200000.safetensors → model_1200000.safetensors.bak
     ```
   - Download the **Arabic checkpoint** and **vocabulary files** from this repository and place them in the same directory.  
3. **Restart the Application**  
   - Relaunch the F5-TTS application to load the Arabic fine-tuned model.
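
The rename-and-copy part of Method 1 can also be scripted. The following is a minimal sketch, not part of F5-TTS: the function name `swap_checkpoint` and its parameters are hypothetical, and it assumes the Arabic checkpoint and vocabulary files have already been downloaded to a local folder. It copies the files under their existing names, because this README does not state whether the checkpoint must be renamed to match the original file.

```python
import shutil
from pathlib import Path


def swap_checkpoint(model_path: Path, new_files: list[Path]) -> Path:
    """Back up the existing model file and copy new files beside it.

    model_path: the model file path printed by the application (step 1).
    new_files:  locally downloaded checkpoint and vocabulary files
                (hypothetical paths supplied by the caller).
    Returns the path of the backup file.
    """
    model_path = Path(model_path)
    if not model_path.is_file():
        raise FileNotFoundError(f"model file not found: {model_path}")

    # Refuse to overwrite an earlier backup so the original is never lost.
    backup = model_path.with_name(model_path.name + ".bak")
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")

    # Step 2: rename e.g. model_1200000.safetensors -> model_1200000.safetensors.bak
    model_path.rename(backup)

    # Step 2: place the downloaded files in the same directory, names unchanged.
    for src in new_files:
        shutil.copy2(src, model_path.parent / Path(src).name)

    return backup
```

Call it with the path shown in the terminal and the list of downloaded files, then restart the application as in step 3. To revert, delete the copied files and rename the `.bak` file back to its original name.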

### Alternative Methods
- **GitHub Repository:** Follow the [F5-TTS setup instructions](https://github.com/SWivid/F5-TTS), but replace the default model with the Arabic checkpoint and vocabulary files provided here.

## Contributions & Collaboration
This model is a **work in progress**, and community contributions are highly encouraged! Suggestions, improvements, and dataset contributions are welcome to refine its performance across different Arabic dialects.

### Recommendations for Better Results
- Use **clear reference audio** with minimal background noise.  
- Ensure **balanced audio levels** for improved synthesis quality.  
- Contributions in **dataset expansion** and **model evaluation** are highly valuable.
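
One way to sanity-check the level of a reference clip before using it is to measure its peak and RMS level. The sketch below uses only the Python standard library and assumes a 16-bit PCM WAV file; the helper name `wav_levels_dbfs` is hypothetical, and this README does not define what counts as "balanced", so the function only reports values and applies no threshold.

```python
import math
import struct
import wave


def wav_levels_dbfs(path: str) -> tuple[float, float]:
    """Return (peak, rms) in dBFS for a 16-bit PCM WAV file.

    All channels are pooled together; 0 dBFS is digital full scale.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = wf.readframes(wf.getnframes())

    # Decode little-endian signed 16-bit samples.
    samples = struct.unpack(f"<{len(raw) // 2}h", raw)
    if not samples:
        raise ValueError("empty audio file")

    full_scale = 32768.0
    peak = max(abs(s) for s in samples) / full_scale
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / full_scale

    def to_db(x: float) -> float:
        # Digital silence has no finite level in decibels.
        return 20 * math.log10(x) if x > 0 else float("-inf")

    return to_db(peak), to_db(rms)
```

A peak at or very near 0 dBFS suggests clipping, while a very low RMS suggests a quiet recording; comparing these numbers across candidate clips is a quick way to pick a consistent reference.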

If you have any questions or suggestions, feel free to reach out! 🚀