Pendrokar/TTS-Spaces-Arena · Request for CosyVoice 2.0, one of the best open source models

18 days ago

CosyVoice 2.0 was released last month. The model is open source (technically open-weight) and comes with its training and fine-tuning recipe.

Listen to the demos here to see why it should be added: https://funaudiollm.github.io/cosyvoice2/

I think the model fights for the top spot in the open source space.

Pendrokar

Owner 17 days ago

•

edited 15 days ago

Would love to, but this Arena does not use a TTS router. It uses other TTS Spaces hosted on HuggingFace, and the current CosyVoice Spaces have issues.

This Space seems to have an endless queue, possibly because there is no limit on text length.
https://huggingface.co/spaces/FunAudioLLM/CosyVoice2-0.5B

And this Space demands a minimum of 300 seconds of ZeroGPU time. That is the entire daily ZeroGPU allowance of a free HF account. I can't afford that.
https://huggingface.co/spaces/tanbw/CosyVoice

mozophe

17 days ago

Makes sense. Thanks for sharing your thoughts.

Pendrokar

Owner 12 days ago

•

edited 12 days ago

Due to the uptick in popularity of this Space, I took the Pro subscription. But it seems that the 5x ZeroGPU processing time does not apply to Gradio API requests.

This shows up when another TTS Space requests a minimum ZeroGPU duration of 300s:
"Svngoku/maskgct-audio-lab:AppError('The upstream Gradio app has raised an exception: The requested GPU duration (300s) is larger than the maximum allowed')"
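For anyone who wants to detect this case programmatically, here is a minimal sketch that pulls the requested duration out of an error string like the one above. The function name `requested_gpu_seconds` is hypothetical, and the pattern is based only on this one quoted message, so the wording may differ across Gradio or ZeroGPU versions.

```python
import re
from typing import Optional

# Pattern derived from the single error message quoted above (assumption:
# other ZeroGPU quota errors use the same wording).
_GPU_DURATION_RE = re.compile(
    r"requested GPU duration \((\d+)s\) is larger than the maximum allowed"
)


def requested_gpu_seconds(error_text: str) -> Optional[int]:
    """Return the requested GPU duration in seconds, or None if the
    text does not look like a ZeroGPU duration-limit error."""
    match = _GPU_DURATION_RE.search(error_text)
    return int(match.group(1)) if match else None


msg = (
    "Svngoku/maskgct-audio-lab:AppError('The upstream Gradio app has raised "
    "an exception: The requested GPU duration (300s) is larger than the "
    "maximum allowed')"
)
print(requested_gpu_seconds(msg))
```

An arena could use a check like this to skip or flag a Space whose requested duration exceeds the caller's quota, rather than treating it as a generic failure.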

[edit] Actually, the 5x allowance may only apply to Spaces on Gradio SDK version 5.12 or later:
https://huggingface.co/posts/StephenGenusa/720642461901666#678640fd85ccfb39daecfc93