SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published 20 days ago • 81
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning Paper • 2503.15558 • Published 15 days ago • 44
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published 20 days ago • 125
C4AI Aya Vision Collection • Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 30 days ago • 68
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 169