
# BreezyVoice

[Playground](); [GitHub](); [Paper]()

**BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights**	

BreezyVoice is a voice-cloning text-to-speech system adapted for Taiwanese Mandarin. It offers phonetic control through auxiliary 注音 (bopomofo) inputs. BreezyVoice is partially derived from [CosyVoice](https://github.com/FunAudioLLM/CosyVoice).


## How to Run

**Following the instructions in the GitHub repository downloads the model automatically.**

You can also run the model from a local path by first cloning the model repository:
```
git lfs install
git clone https://huggingface.co/MediaTek-Research/BreezyVoice-300M
```
Then use the model as described in the `run_inference.py` script, passing the local model path through the `model_path` parameter.
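
One common pitfall with a local clone is that, if Git LFS was not active, the weight files are only small text pointers instead of the real weights. The sketch below is a hypothetical helper, not part of BreezyVoice: it checks a cloned directory for unresolved pointers and then builds a command line for `run_inference.py`. The function names and the `--model_path` flag spelling are assumptions for illustration; check the script itself for the exact argument names and any other required arguments.

```python
from pathlib import Path

# First line of every Git LFS pointer file, as defined by the Git LFS spec.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that are still unresolved Git LFS pointers."""
    root = Path(model_dir)
    pointers = []
    for path in sorted(root.rglob("*")):
        # Skip directories and Git's own metadata folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as handle:
            if handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                pointers.append(path)
    return pointers


def build_inference_command(model_dir, script="run_inference.py"):
    """Build an argv list that points the inference script at a local clone.

    ASSUMPTION: the script accepts the local path as `--model_path`.
    Other arguments the script may require are intentionally not shown.
    """
    root = Path(model_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"model directory not found: {root}")
    unresolved = find_lfs_pointers(root)
    if unresolved:
        raise RuntimeError(
            f"{len(unresolved)} file(s) are Git LFS pointers; "
            "run `git lfs pull` inside the clone"
        )
    return ["python", script, "--model_path", str(root)]
```

The returned list can be passed to `subprocess.run` once the remaining arguments for the script have been appended.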

If you like our work, please cite: