InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published 8 days ago • 170
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy Paper • 2503.19757 • Published Mar 25 • 52