VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning Paper • 2503.13444 • Published 1 day ago • 8
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning Paper • 2503.13444 • Published 1 day ago • 8
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning Paper • 2503.13444 • Published 1 day ago • 8 • 1
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary Paper • 2503.09402 • Published 6 days ago • 6
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary Paper • 2503.09402 • Published 6 days ago • 6 • 2
Automated Movie Generation via Multi-Agent CoT Planning Paper • 2503.07314 • Published 8 days ago • 40
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles Paper • 2503.03651 • Published 13 days ago • 16
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models Paper • 2503.01774 • Published 15 days ago • 41
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data Paper • 2502.14397 • Published 27 days ago • 38