Pre-training Auto-regressive Robotic Models with 4D Representations Paper • 2502.13142 • Published 24 days ago • 4 • 2
LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning Paper • 2406.11815 • Published Jun 17, 2024 • 1
Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning Paper • 2406.15334 • Published Jun 21, 2024 • 9 • 1
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models Paper • 2305.19595 • Published May 31, 2023
Teaching Structured Vision&Language Concepts to Vision&Language Models Paper • 2211.11733 • Published Nov 21, 2022
FETA: Towards Specializing Foundation Models for Expert Task Applications Paper • 2209.03648 • Published Sep 8, 2022
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs Paper • 2406.08164 • Published Jun 12, 2024