GenRecal: Generation after Recalibration from Large to Small Vision-Language Models Paper • 2506.15681 • Published Jun 18 • 39
patrickjohncyh/fashion-clip Zero-Shot Image Classification • 0.2B • Updated Sep 17, 2024 • 2.89M • 235
Running on Zero MCP 165 DocScope-R1 📰 cosmos reason1 / docscopeocr / visionocr / captioner relaxed
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated 7 days ago • 513
Article Post-Training Isaac GR00T N1.5 for LeRobot SO-101 Arm By nvidia and 4 others • Jun 11 • 71
Article Introducing Training Cluster as a Service - a new collaboration with NVIDIA By jeffboudier and 2 others • Jun 11 • 24
Vision Language Models Papers 🖼️💬📝 Collection Papers about vision-language models, most important ones are on top of the list. • 27 items • Updated Apr 30, 2024 • 38
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2 • 119
SmolVLA Collection Small, efficient and light-weight VLAs pretrained on community datasets • 1 item • Updated Jun 1 • 27
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation Paper • 2401.02117 • Published Jan 4, 2024 • 34
Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • Jun 3 • 211