# BreezyVoice
[Playground](); [GitHub](); [Paper]()
**BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights**
BreezyVoice is a voice-cloning text-to-speech system adapted specifically for Taiwanese Mandarin. It offers phonetic control through auxiliary 注音 (bopomofo) inputs. BreezyVoice is partially derived from [CosyVoice](https://github.com/FunAudioLLM/CosyVoice).
## How to Run
**Following the instructions in the GitHub repository downloads the model for you automatically.**
You can also run the model from a local path by first cloning the model repository:
```
git lfs install
git clone https://huggingface.co/MediaTek-Research/BreezyVoice-300M
```
Then use the model as described in the `run_inference.py` script, passing the local model path through the `model_path` parameter.
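As a minimal sketch of that last step, the helper below builds a command line that points `run_inference.py` at the cloned directory. It assumes `model_path` is exposed as a `--model_path` command-line flag and that the script sits in the current working directory. The function name `build_inference_command` is hypothetical, and any further arguments the script needs are omitted, so check the script itself for the exact interface.

```python
import sys
from pathlib import Path


def build_inference_command(model_dir, script="run_inference.py"):
    """Return an argv list that runs the script against a local model clone."""
    model_path = Path(model_dir).expanduser().resolve()
    if not model_path.is_dir():
        raise FileNotFoundError(f"Model directory not found: {model_path}")
    # Assumption: the model_path parameter is passed as a --model_path flag.
    return [sys.executable, script, "--model_path", str(model_path)]


if __name__ == "__main__":
    # "BreezyVoice-300M" is the directory created by the git clone above.
    try:
        print(" ".join(build_inference_command("BreezyVoice-300M")))
    except FileNotFoundError as err:
        print(err)
```

The returned list can be handed to `subprocess.run` once the remaining arguments of the script are filled in. Resolving the path up front makes a missing or misplaced clone fail early with a clear message.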
If you like our work, please cite: