BreezyVoice

Playground; GitHub; Paper

BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights

BreezyVoice is a voice-cloning text-to-speech system specifically adapted for Taiwanese Mandarin, highlighting phonetic control abilities via auxiliary 注音 (bopomofo) inputs. BreezyVoice is partially derived from CosyVoice.

How to Run

Following the instructions in the GitHub repository downloads the model automatically.

You can also run the model from a local path by cloning the model repository first:

git lfs install
git clone https://huggingface.co/MediaTek-Research/BreezyVoice-300M

Then use the model as specified in the run_inference.py script, passing the local model path through the model_path parameter.
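A common pitfall with a local clone is that Git LFS was not active, which leaves small text pointer files in place of the real weights. The sketch below checks a cloned directory for such pointer files and then assembles a command line for run_inference.py. It is only an illustration under stated assumptions: the helper names `find_lfs_pointers` and `build_inference_command` are hypothetical, and the `--model_path` spelling is an assumption based on the model_path parameter mentioned above. The script will likely need further arguments (for example the text to synthesize), so check the GitHub repository for the actual interface.

```python
import sys
from pathlib import Path

# Every Git LFS pointer file starts with this line (Git LFS pointer spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that are still Git LFS pointer stubs."""
    root = Path(model_dir)
    stubs = []
    for path in sorted(root.rglob("*")):
        # Skip directories and Git's own metadata folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as fh:
            if fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                stubs.append(path)
    return stubs


def build_inference_command(model_dir, script="run_inference.py"):
    """Build an argument list that points the script at a local model clone.

    ASSUMPTION: the flag is spelled `--model_path`; the source only says the
    script takes a `model_path` parameter.
    """
    root = Path(model_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"model directory not found: {root}")
    stubs = find_lfs_pointers(root)
    if stubs:
        names = ", ".join(p.name for p in stubs)
        raise RuntimeError(
            f"unresolved Git LFS pointers ({names}); run `git lfs pull` in {root}"
        )
    return [sys.executable, script, "--model_path", str(root)]
```

The returned list could be passed to `subprocess.run(cmd, check=True)` from inside a checkout of the BreezyVoice GitHub repository, after appending whatever other arguments the script requires.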

If you like our work, please cite:
