Update README.md
README.md
CHANGED
@@ -13,56 +13,62 @@ license: apache-2.0
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference


- # 🎤 Voice Cloning App (

- This Hugging Face Space
-
-
-

---

## Features
- -
- -
- -

---

- ##
- 1. Upload a **sample voice** (5–10 seconds is enough).
- 2. Choose between:
-    - **Text → Speech**: Enter text → AI speaks in the sample voice.
-    - **Audio → Audio**: Upload another audio → AI transcribes it and re-generates in the sample voice.
- 3. Download your cloned audio result.

-

-
-
-
- -
-

-

- ##
-
-
-


-

- ## ⚡ Notes
- - CPU Spaces are slower. Expect **30–60 seconds** processing per request.
- - For faster generation, enable a **GPU Space**.
- - Works best with clean recordings (no background noise).

-

- ## Acknowledgements
- - [MyShell.ai](https://myshell.ai) for OpenVoice
- - [Coqui.ai](https://coqui.ai) for pioneering open-source TTS
- - [OpenAI Whisper](https://github.com/openai/whisper) for ASR

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference


+ # 🎤 Voice Cloning App (XTTS-v2)

+ This is a Hugging Face Space demo for **voice cloning**.
+ Upload a short **sample voice recording** and enter any text, and the model will synthesize speech in the uploaded voice.
+
+ Powered by **Coqui XTTS-v2**, running fully on CPU (works in free Spaces).

---

## Features
+ - Clone a voice with only a few seconds of reference audio.
+ - Enter text and get speech in the **same cloned voice**.
+ - Supports both `.mp3` and `.wav` formats.
+ - Runs on CPU (no GPU required).

---

+ ## Installation

+ Run locally:

+ ```bash
+ git clone https://huggingface.co/spaces/your-username/voice-clone-app
+ cd voice-clone-app
+ pip install -r requirements.txt
+ ```

+ ## Requirements
+ - `TTS==0.22.0`
+ - `torch`
+ - `pydub`
+ - `gradio`

+ ## ▶️ Usage
+ Start the Gradio app:
+ `python app.py`
+ Then open your browser at:
+ http://127.0.0.1:7860/


+ ## How it Works
+ 1. Upload a sample voice audio (`.wav` or `.mp3`).
+ 2. Enter the text you want spoken.
+ 3. The model clones the sample voice and generates audio output.
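
The three steps above could be wired together roughly as follows. This is a minimal sketch, not the Space's actual `app.py`: the helper names (`validate_inputs`, `clone_voice`, `build_demo`), the output file name, and the fixed `language="en"` are assumptions made for illustration. The model identifier is the one Coqui TTS uses for XTTS-v2. Converting `.mp3` samples to `.wav` (for which `pydub` is presumably listed in the requirements) is left out.

```python
from pathlib import Path

# Coqui TTS model identifier for XTTS-v2.
MODEL_NAME = "tts_models/multilingual/multi-dataset/xtts_v2"
SUPPORTED_EXTENSIONS = {".wav", ".mp3"}
OUTPUT_PATH = "output.wav"  # Assumption: hypothetical output file name.


def validate_inputs(sample_path: str, text: str) -> str:
    """Check both user inputs and return the text with whitespace trimmed."""
    if Path(sample_path).suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError("Sample voice must be a .wav or .mp3 file.")
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("Text to speak must not be empty.")
    return cleaned


def clone_voice(sample_path: str, text: str) -> str:
    """Synthesize `text` in the voice of `sample_path`; return the output path."""
    cleaned = validate_inputs(sample_path, text)
    # Imported lazily: loading Coqui TTS is slow and fetches the model on first use.
    from TTS.api import TTS

    tts = TTS(MODEL_NAME).to("cpu")
    # Assumption: English output; XTTS-v2 requires an explicit language code.
    tts.tts_to_file(
        text=cleaned,
        speaker_wav=sample_path,
        language="en",
        file_path=OUTPUT_PATH,
    )
    return OUTPUT_PATH


def build_demo():
    """Wrap clone_voice in a Gradio interface with two inputs and one output."""
    import gradio as gr

    return gr.Interface(
        fn=clone_voice,
        inputs=[
            gr.Audio(type="filepath", label="Sample voice"),
            gr.Textbox(label="Text to speak"),
        ],
        outputs=gr.Audio(type="filepath", label="Cloned speech"),
    )
```

Calling `build_demo().launch()` would start the local server, which Gradio serves on port 7860 by default.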


+ ## ⚠️ Notes
+ - Voice cloning quality depends on the length and clarity of the sample voice.
+ - Works best with clean recordings of 5–10 seconds or more.
+ - CPU inference is slower than GPU inference.
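
A sample that is too short can be caught before it reaches the model. The sketch below uses only Python's standard `wave` module, so it handles uncompressed `.wav` files only; an `.mp3` sample would need to be converted first. The function names and the 5-second threshold (taken from the guidance above) are assumptions for illustration, not part of the app.

```python
import wave

MIN_SECONDS = 5.0  # Assumption: lower bound of the "5–10 seconds" guidance.


def wav_duration_seconds(path: str) -> float:
    """Return the length of a PCM .wav file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_long_enough(path: str, min_seconds: float = MIN_SECONDS) -> bool:
    """Return True if the recording lasts at least `min_seconds`."""
    return wav_duration_seconds(path) >= min_seconds
```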
+
+
+ ## 🔮 Future Plans
+ - Add Audio → Audio cloning (transcribe + re-synthesize).
+ - Add multi-language support.
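
The planned Audio → Audio mode amounts to chaining a transcription step and a synthesis step. The sketch below shows only that chaining, with both steps passed in as callables. The name `audio_to_audio` and its signature are hypothetical, and nothing here reflects code that exists in the Space. In practice `transcribe` might be backed by an ASR model such as Whisper and `synthesize` by XTTS-v2.

```python
from typing import Callable


def audio_to_audio(
    source_audio: str,
    sample_voice: str,
    transcribe: Callable[[str], str],
    synthesize: Callable[[str, str], str],
) -> str:
    """Transcribe `source_audio`, then re-synthesize the text in the sample voice.

    `transcribe` maps an audio path to text; `synthesize` maps
    (sample voice path, text) to the path of the generated audio.
    """
    text = transcribe(source_audio).strip()
    if not text:
        raise ValueError("No speech was recognized in the source audio.")
    return synthesize(sample_voice, text)
```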


+ ## ✨ Built with
+ Coqui TTS + Gradio