MoCha: Towards Movie-Grade Talking Character Synthesis Paper • 2503.23307 • Published 4 days ago • 59
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 3 items • Updated 7 days ago • 76
Modifying Large Language Model Post-Training for Diverse Creative Writing Paper • 2503.17126 • Published 13 days ago • 33
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 15 days ago • 112
RWKV-7 "Goose" with Expressive Dynamic State Evolution Paper • 2503.14456 • Published 15 days ago • 130
Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption Paper • 2503.09279 • Published 22 days ago • 5
Autoregressive Image Generation with Randomized Parallel Decoding Paper • 2503.10568 • Published 20 days ago • 8
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice Paper • 2503.05978 • Published 26 days ago • 34
GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control Paper • 2503.03751 • Published 28 days ago • 20
C4AI Aya Vision Collection Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 30 days ago • 68
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think Paper • 2502.20172 • Published Feb 27 • 28