Request for CosyVoice 2.0, one of the best open source models

#23
by mozophe - opened

CosyVoice 2.0 was released last month. The model is open source (technically open weights), and the training and fine-tuning recipes were released with it.

Listen to the demos here to see why it should be added: https://funaudiollm.github.io/cosyvoice2/

I think the model fights for the top spot in the open source space.

Would love to, but this Arena does not use a TTS router. It calls other TTS Spaces hosted on HuggingFace, and the current CosyVoice Spaces have issues.

This Space seems to have an endless queue, possibly because it puts no limit on text length.
https://huggingface.co/spaces/FunAudioLLM/CosyVoice2-0.5B

And this Space demands at least 300 seconds of ZeroGPU time. That is the entire daily ZeroGPU allowance of a free HF account, so I can't afford it.
https://huggingface.co/spaces/tanbw/CosyVoice

Makes sense. Thanks for sharing your thoughts.

Due to an uptick in the popularity of this Space, I took the Pro subscription. But it seems that the 5x ZeroGPU processing time does not apply to Gradio API requests.

This shows up when another TTS Space requests a minimum of 300s of ZeroGPU time:
"Svngoku/maskgct-audio-lab:AppError('The upstream Gradio app has raised an exception: The requested GPU duration (300s) is larger than the maximum allowed')"

[edit] Actually, the 5x quota may only apply to Spaces running Gradio SDK version 5.12 or later.
https://huggingface.co/posts/StephenGenusa/720642461901666#678640fd85ccfb39daecfc93
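
For anyone wiring up something similar, here is a rough sketch of how a caller could detect this kind of quota error and skip the offending Space. It only parses the error text quoted above; the function names and the per-request budget are my own assumptions, not anything the Arena or Gradio actually exposes, and the message wording may differ in other versions.

```python
import re
from typing import Optional

# Matches the wording of the quota error quoted above.
# Assumption: other Gradio/ZeroGPU versions may phrase it differently.
_DURATION_RE = re.compile(r"requested GPU duration \((\d+)s\)")


def requested_gpu_seconds(error_message: str) -> Optional[int]:
    """Return the GPU duration (in seconds) named in the error, or None if absent."""
    match = _DURATION_RE.search(error_message)
    return int(match.group(1)) if match else None


def should_skip_space(error_message: str, budget_seconds: int) -> bool:
    """True when the Space asked for more GPU time than the caller's budget.

    budget_seconds is a hypothetical per-request limit chosen by the caller.
    """
    seconds = requested_gpu_seconds(error_message)
    return seconds is not None and seconds > budget_seconds


error = (
    "Svngoku/maskgct-audio-lab:AppError('The upstream Gradio app has raised an "
    "exception: The requested GPU duration (300s) is larger than the maximum allowed')"
)
```

With a budget of, say, 60 seconds, `should_skip_space(error, 60)` flags this Space, while unrelated errors (timeouts, queue failures) are left for other handling.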
