Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 31 minutes ago • 299
UI Agent Collection A collection of algorithmic agents for user interfaces/interactions, program synthesis, and robots • 312 items • Updated about 11 hours ago • 47
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning Paper • 2409.20566 • Published Sep 30, 2024 • 56
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents Paper • 2408.06327 • Published Aug 12, 2024 • 17