A text-to-speech model powered by SparkAudio and Mobvoi.
Extract and visualize knowledge graphs from any text
Ranking of LLMs for agentic tasks
Interact with AI using text, images, or audio
Generate podcasts using Kokoro-TTS!
Generate text and audio responses from images and videos
Magma-8B model for UI Agents
Large Language Diffusion Models
OmniParser: turn your LLM into a GUI agent
Gradio demo for MatAnyone
Demo for audiobox-aesthetics
Speech Synthesis with Zonos