# BreezyVoice

[Playground](); [GitHub](); [Paper]()

**BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights**	

BreezyVoice is a voice-cloning text-to-speech system adapted specifically for Taiwanese Mandarin, with an emphasis on phonetic control through auxiliary 注音 (bopomofo) inputs. BreezyVoice is partially derived from [CosyVoice](https://github.com/FunAudioLLM/CosyVoice).


## How to Run

**Following the instructions on GitHub downloads the model for you automatically.**

You can also run the model from a local path by first cloning the model repository:
```
git lfs install
git clone https://huggingface.co/MediaTek-Research/BreezyVoice-300M
```
Then run the model as specified in the `run_inference.py` script, passing the local model path through the `model_path` parameter.
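If you would rather not use `git lfs`, the same files can probably be fetched from Python. The following is a minimal sketch, assuming the `huggingface_hub` package is installed. The `default_local_dir` and `download_model` helpers and the `models` folder name are hypothetical and are not part of BreezyVoice itself.

```python
from pathlib import Path
from typing import Optional

# Repository ID taken from the clone URL above.
REPO_ID = "MediaTek-Research/BreezyVoice-300M"


def default_local_dir(base: Path = Path("models")) -> Path:
    """Hypothetical helper: choose a local folder named after the repository."""
    return base / REPO_ID.split("/")[-1]


def download_model(local_dir: Optional[Path] = None) -> Path:
    """Download the model snapshot and return the folder that holds it."""
    # Imported lazily so this module loads even without huggingface_hub.
    from huggingface_hub import snapshot_download

    target = local_dir or default_local_dir()
    snapshot_download(repo_id=REPO_ID, local_dir=str(target))
    return target


if __name__ == "__main__":
    path = download_model()
    print(f"Model files downloaded to {path}")
    # Pass this folder to run_inference.py through its model_path parameter.
```

The returned folder plays the same role as the directory created by `git clone`, so it can be supplied wherever the local model path is expected.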

If you like our work, please cite: