Space: remiai3/clone_your_voice

remiai3 committed 6 days ago

Commit ad8ff68 (verified) · 1 Parent(s): c7c74d8

Update README.md

README.md +41 -35

@@ -13,56 +13,62 @@ license: apache-2.0
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
-# 🎤 Voice Cloning App (OpenVoice + Whisper)
-This Hugging Face Space lets you **clone any voice** by uploading a short sample.
-You can either:
-1. **Text → Speech Cloning**: Type any text and the app will generate speech in the cloned voice.
-2. **Audio → Audio Cloning**: Upload a content audio, and the app will convert it into the **sample speaker's voice**.
 ---
 ## 🚀 Features
-- Supports **mp3** and **wav** input (mp3 is auto-converted to wav internally).
-- Works on **CPU only** (free Spaces) – but will run slower compared to GPU.
-- Uses **OpenVoice (MyShell)** for voice cloning and **Whisper (OpenAI)** for automatic speech recognition.
 ---
-## 🛠️ How it Works
-1. Upload a **sample voice** (5–10 seconds is enough).
-2. Choose between:
-   - **Text → Speech**: Enter text → AI speaks in the sample voice.
-   - **Audio → Audio**: Upload another audio → AI transcribes it and re-generates in the sample voice.
-3. Download your cloned audio result.
----
-## ⚙️ Tech Stack
-- [OpenVoice (MyShell)](https://huggingface.co/myshell-ai/OpenVoice) – high-quality speaker timbre cloning (~80–90% similarity).
-- [Whisper-small](https://huggingface.co/openai/whisper-small) – automatic speech recognition (ASR).
-- [Gradio](https://gradio.app/) – simple and clean web UI.
-- [PyDub](https://github.com/jiaaro/pydub) – for mp3 → wav conversion.
----
-## 📂 Project Structure
-├── app.py # Main Gradio app
-├── requirements.txt # Dependencies
-└── README.md # This file
----
-## ⚡ Notes
-- CPU Spaces are slower. Expect **30–60 seconds** processing per request.
-- For faster generation, enable a **GPU Space**.
-- Works best with clean recordings (no background noise).
----
-## 🙌 Acknowledgements
-- [MyShell.ai](https://myshell.ai) for OpenVoice
-- [Coqui.ai](https://coqui.ai) for pioneering open-source TTS
-- [OpenAI Whisper](https://github.com/openai/whisper) for ASR

 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# 🎤 Voice Cloning App (XTTS-v2)
+This is a Hugging Face Space demo for **voice cloning**.
+Upload a short **sample voice recording** and enter any text — the AI will synthesize speech in the uploaded voice.
+Powered by **Coqui XTTS-v2**, running fully on CPU (works in free Spaces).
 ---
 ## 🚀 Features
+- Clone a voice with only a few seconds of reference audio.
+- Input text → get speech in the **same cloned voice**.
+- Supports both `.mp3` and `.wav` formats.
+- Runs on CPU (no GPU required).
 ---
+## 🛠 Installation
+Run locally:
+```bash
+git clone https://huggingface.co/spaces/your-username/voice-clone-app
+cd voice-clone-app
+pip install -r requirements.txt
+```
+## Requirements
+- `TTS==0.22.0`
+- `torch`
+- `pydub`
+- `gradio`
+## ▶️ Usage
+Start the Gradio app:
+`python app.py`
+Then open the browser at:
+👉 http://127.0.0.1:7860/
+## 📂 How it Works
+1. Upload a sample voice audio (`.wav` or `.mp3`).
+2. Enter the text you want spoken.
+3. The model clones the sample voice and generates audio output.
+## ⚠️ Notes
+- Voice cloning quality depends on the length and clarity of the sample voice.
+- Works best with clean recordings (5–10 seconds or more).
+- CPU inference may be slower than GPU.
+## 🔮 Future Plans
+- Add Audio → Audio cloning (transcribe + re-synthesize).
+- Add multi-language support.
+## ✨ Built with
+Coqui TTS + Gradio
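
Reviewer note: the diff does not include `app.py`, so the following is only a minimal sketch of how the three "How it Works" steps might map onto the listed requirements, not the Space's actual code. The function names (`wav_duration_seconds`, `is_long_enough`, `to_wav`, `clone_voice`) are hypothetical, and the 5-second threshold is taken from the Notes section. The model identifier and the `tts_to_file` call follow the Coqui TTS Python API; `pydub` needs ffmpeg installed to read `.mp3` files.

```python
import wave
from pathlib import Path

# XTTS-v2 model identifier as published in the Coqui TTS model list.
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def wav_duration_seconds(path):
    """Return the length of a PCM .wav file in seconds (stdlib only)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path, min_seconds=5.0):
    """Check the reference clip against the README's 5-10 second guidance."""
    return wav_duration_seconds(path) >= min_seconds


def to_wav(path):
    """Return a .wav path for the sample, converting with pydub if needed."""
    path = Path(path)
    if path.suffix.lower() == ".wav":
        return path
    # Imported lazily so the helpers above work without pydub installed.
    from pydub import AudioSegment

    wav_path = path.with_suffix(".wav")
    AudioSegment.from_file(str(path)).export(str(wav_path), format="wav")
    return wav_path


def clone_voice(sample_path, text, out_path="output.wav", language="en"):
    """Synthesize `text` in the voice of `sample_path` on CPU."""
    # Imported lazily: loading TTS pulls in torch and downloads the model.
    from TTS.api import TTS

    sample = to_wav(sample_path)
    if not is_long_enough(sample):
        print("Warning: reference clip is under 5 seconds; quality may suffer.")
    tts = TTS(XTTS_MODEL).to("cpu")
    tts.tts_to_file(
        text=text,
        speaker_wav=str(sample),
        language=language,
        file_path=str(out_path),
    )
    return out_path
```

Under these assumptions, a Gradio handler would only need to pass the uploaded file path and the text box contents to `clone_voice` and return the resulting file path.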