Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers Paper • 2505.21497 • Published May 27 • 108
VSSD: Vision Mamba with Non-Causal State Space Duality Paper • 2407.18559 • Published Jul 26, 2024 • 20
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model Paper • 2401.09417 • Published Jan 17, 2024 • 63
MobileMamba: Lightweight Multi-Receptive Visual Mamba Network Paper • 2411.15941 • Published Nov 24, 2024 • 2
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality Paper • 2411.15241 • Published Nov 22, 2024 • 7
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 535
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding Paper • 2311.08046 • Published Nov 14, 2023 • 2
MagpieLM Collection Aligning LMs with Fully Open Recipe + Synthetic Data Generated from Open-Source LMs. • 9 items • Updated Jan 13 • 16