Chat with a Qwen AI assistant
An end-to-end (e2e) Voice Language Model by Fish Audio.
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
Real-time implementation of Whisper large turbo
Generate image descriptions from text prompts
Detect objects in images and get bounding boxes
Vision Model
A GPT-4o-like bot.