Update README.md
README.md
CHANGED
@@ -1,51 +1,55 @@
|
|
1 |
-
F5-TTS: Fine-Tuned Arabic Speech Synthesis Model
|
2 |
-
License: CC BY-NC 4.0
|
3 |
-
Base Model: SWivid/F5-TTS
|
4 |
|
5 |
-
|
|
|
|
|
|
|
6 |
This project fine-tunes the F5-TTS model for high-quality Arabic speech synthesis, incorporating regional diversity in pronunciation and accents. The fine-tuning process is ongoing, and temporary checkpoints are provided as progress updates. Future iterations will include improved models with enhanced accuracy and naturalness.
|
7 |
|
8 |
-
License
|
9 |
-
This model is released under the CC BY-NC 4.0 license, which allows free usage, modification, and distribution for non-commercial purposes.
|
10 |
-
|
11 |
-
Datasets
|
12 |
-
Training is based on the Common Voice Arabic Dataset
|
13 |
-
|
14 |
-
Model Information
|
15 |
-
Base Model
|
16 |
-
Current Status
|
17 |
-
Training Configuration
|
18 |
-
Batch Size
|
19 |
-
Max Samples
|
20 |
-
Training Steps
|
21 |
-
(Final training parameters will be updated upon completion of fine-tuning.)
|
22 |
-
|
|
|
23 |
To use the fine-tuned Arabic model, follow these steps:
|
24 |
|
25 |
-
Method 1: Manual Model Replacement
|
26 |
-
Run the F5-TTS Application
|
27 |
-
Start the application and locate the model file path displayed in the terminal. Example:
|
28 |
-
|
29 |
-
|
30 |
-
|
31 |
-
|
32 |
-
|
33 |
-
|
34 |
-
|
35 |
-
|
36 |
-
|
37 |
-
|
38 |
-
|
39 |
-
|
40 |
-
|
41 |
-
Alternative Methods
|
42 |
-
GitHub Repository
|
43 |
-
|
44 |
-
|
45 |
-
|
46 |
-
|
47 |
-
|
48 |
-
|
49 |
-
|
|
|
|
|
50 |
If you have any questions or suggestions, feel free to reach out! 🚀
|
51 |
|
|
|
1 |
+
# F5-TTS: Fine-Tuned Arabic Speech Synthesis Model
|
|
|
|
|
2 |
|
3 |
+
**License:** CC BY-NC 4.0
|
4 |
+
**Base Model:** SWivid/F5-TTS
|
5 |
+
|
6 |
+
## Overview
|
7 |
This project fine-tunes the F5-TTS model for high-quality Arabic speech synthesis, incorporating regional diversity in pronunciation and accents. The fine-tuning process is ongoing, and temporary checkpoints are provided as progress updates. Future iterations will include improved models with enhanced accuracy and naturalness.
|
8 |
|
9 |
+
## License
|
10 |
+
This model is released under the **CC BY-NC 4.0** license, which allows use, modification, and distribution for **non-commercial** purposes, provided the original work is attributed.
|
11 |
+
|
12 |
+
## Datasets
|
13 |
+
Training is based on the **Common Voice Arabic Dataset**, a crowdsourced dataset featuring diverse Arabic accents and dialects. Additional datasets may be incorporated in future updates to improve dialectal coverage and pronunciation accuracy.
|
14 |
+
|
15 |
+
## Model Information
|
16 |
+
- **Base Model:** SWivid/F5-TTS
|
17 |
+
- **Current Status:** Ongoing fine-tuning (Temporary Checkpoints Available)
|
18 |
+
- **Training Configuration:**
|
19 |
+
- **Batch Size:** TBD
|
20 |
+
- **Max Samples:** TBD
|
21 |
+
- **Training Steps:** TBD
|
22 |
+
- *(Final training parameters will be updated upon completion of fine-tuning.)*
|
23 |
+
|
24 |
+
## Usage Instructions
|
25 |
To use the fine-tuned Arabic model, follow these steps:
|
26 |
|
27 |
+
### Method 1: Manual Model Replacement
|
28 |
+
1. **Run the F5-TTS Application**
|
29 |
+
- Start the application and locate the model file path displayed in the terminal. Example:
|
30 |
+
```
|
31 |
+
model : C:\Users\yourname\.cache\huggingface\hub\models--SWivid--F5-TTS\snapshots\995ff41929c08ff968786b448a384330438b5cb6\F5TTS_Base\model_1200000.safetensors
|
32 |
+
```
|
33 |
+
2. **Replace the Model File**
|
34 |
+
- Navigate to the displayed file location.
|
35 |
+
- Rename the existing model file:
|
36 |
+
```
|
37 |
+
model_1200000.safetensors → model_1200000.safetensors.bak
|
38 |
+
```
|
39 |
+
- Download the **Arabic checkpoint** and **vocabulary files** from this repository and place them in the same directory.
|
40 |
+
3. **Restart the Application**
|
41 |
+
- Relaunch the F5-TTS application to load the Arabic fine-tuned model.
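The manual steps above can also be scripted. The following is a minimal sketch, not an official tool: `install_checkpoint` is a hypothetical helper, and it assumes the application loads whatever file sits at the original model path, so the Arabic checkpoint is copied in under the original file name. The checkpoint and vocabulary file names are placeholders for the files downloaded from this repository.

```python
import shutil
from pathlib import Path


def install_checkpoint(original: Path, arabic_ckpt: Path, arabic_vocab: Path) -> Path:
    """Back up the cached base model and put the Arabic files next to it.

    `original` is the model path printed in the terminal.
    Returns the path of the backup file.
    """
    backup = original.with_name(original.name + ".bak")
    if not backup.exists():
        # Keep the first backup so the base model can always be restored.
        original.rename(backup)
    # Assumption: the app reads the file at the original path, so the
    # Arabic checkpoint is copied in under the original file name.
    shutil.copy2(arabic_ckpt, original)
    # The vocabulary file keeps its own name in the same directory.
    shutil.copy2(arabic_vocab, original.parent / arabic_vocab.name)
    return backup
```

To restore the base model, delete the replaced file and rename the `.bak` file back to `model_1200000.safetensors`.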
|
42 |
+
|
43 |
+
### Alternative Methods
|
44 |
+
- **GitHub Repository:** Follow the [F5-TTS setup instructions](https://github.com/SWivid/F5-TTS), but replace the default model with the Arabic checkpoint and vocabulary files provided here.
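For the GitHub route, one way to point the upstream command-line tool at the Arabic files is sketched below. This is an assumption-laden example rather than a documented procedure for this checkpoint: it relies on the `--ckpt_file`, `--vocab_file`, `--ref_audio`, `--ref_text`, and `--gen_text` options of `f5-tts_infer-cli` as found in the upstream repository, which should be verified with `f5-tts_infer-cli --help` for the installed version. `build_infer_command` is a hypothetical helper, and the `model` argument is optional because accepted values for `--model` differ between F5-TTS releases.

```python
from typing import List, Optional


def build_infer_command(
    ckpt_file: str,
    vocab_file: str,
    ref_audio: str,
    ref_text: str,
    gen_text: str,
    model: Optional[str] = None,
) -> List[str]:
    """Build an argument list for the upstream inference CLI."""
    cmd = ["f5-tts_infer-cli"]
    if model is not None:
        # Only pass --model when the installed version needs it.
        cmd += ["--model", model]
    cmd += [
        "--ckpt_file", ckpt_file,    # Arabic checkpoint from this repository
        "--vocab_file", vocab_file,  # Arabic vocabulary file from this repository
        "--ref_audio", ref_audio,    # reference recording to imitate
        "--ref_text", ref_text,      # transcript of the reference recording
        "--gen_text", gen_text,      # text to synthesize
    ]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing the arguments as a list avoids shell quoting problems with Arabic text and with paths that contain spaces.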
|
45 |
+
|
46 |
+
## Contributions & Collaboration
|
47 |
+
This model is a **work in progress**, and community contributions are encouraged. Suggestions, improvements, and dataset contributions that help refine its performance across Arabic dialects are welcome.
|
48 |
+
|
49 |
+
### Recommendations for Better Results
|
50 |
+
- Use **clear reference audio** with minimal background noise.
|
51 |
+
- Ensure **balanced audio levels** for improved synthesis quality.
|
52 |
+
- Contributions in **dataset expansion** and **model evaluation** are highly valuable.
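As a rough aid for the audio recommendations above, the peak and RMS level of a reference clip can be measured before use. This is a minimal sketch using only the Python standard library. `audio_levels` is a hypothetical helper that handles 16-bit PCM WAV files only, and this README does not define numeric targets, so interpreting the values is left to the user.

```python
import math
import struct
import wave
from typing import Tuple


def audio_levels(path: str) -> Tuple[float, float]:
    """Return (peak_dbfs, rms_dbfs) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    # Samples from all channels are measured together.
    samples = struct.unpack(f"<{len(frames) // 2}h", frames)
    if not samples:
        raise ValueError("empty audio file")
    full_scale = 32768.0
    peak = max(abs(s) for s in samples) / full_scale
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / full_scale

    def to_db(x: float) -> float:
        # Digital silence has no finite level in decibels.
        return 20 * math.log10(x) if x > 0 else float("-inf")

    return to_db(peak), to_db(rms)
```

A peak close to 0 dBFS suggests possible clipping, and a very low RMS suggests a quiet recording. Comparing these values across reference clips gives a simple consistency check.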
|
53 |
+
|
54 |
If you have any questions or suggestions, feel free to reach out! 🚀
|
55 |
|