FlowTok: Flowing Seamlessly Across Text and Image Tokens Paper • 2503.10772 • Published 4 days ago • 15
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published 3 days ago • 90
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance Paper • 2503.10391 • Published 5 days ago • 10
Tuning-Free Multi-Event Long Video Generation via Synchronized Coupled Sampling Paper • 2503.08605 • Published 7 days ago • 23
Automated Movie Generation via Multi-Agent CoT Planning Paper • 2503.07314 • Published 8 days ago • 40
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published 16 days ago • 58
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles Paper • 2503.03651 • Published 13 days ago • 16
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models Paper • 2503.01774 • Published 14 days ago • 41
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data Paper • 2502.14397 • Published 26 days ago • 38
AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence Paper • 2502.13943 • Published 26 days ago • 7
Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published 30 days ago • 52
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published Feb 12 • 30