"Does the cafe entrance look accessible? Where is the door?" Towards Geospatial AI Agents for Visual Inquiries Paper • 2508.15752 • Published 12 days ago • 7
UI Agent Collection a collection of algorithmic agents for user interfaces/interactions, program synthesis, and robotics • 410 items • Updated about 4 hours ago • 62
An Illusion of Progress? Assessing the Current State of Web Agents Paper • 2504.01382 • Published Apr 2 • 4
InternVL3.5 Collection This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated 4 days ago • 80
MolmoAct Data Mixture Collection All datasets for the MolmoAct (Multimodal Open Language Model for Action) release. • 4 items • Updated 7 days ago • 12
Test-Time Reinforcement Learning for GUI Grounding via Region Consistency Paper • 2508.05615 • Published 26 days ago • 21
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent Paper • 2508.05748 • Published 26 days ago • 121
Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal Paper • 2508.05988 • Published 25 days ago • 19
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published 25 days ago • 169
Mono-InternVL-1.5: Towards Cheaper and Faster Monolithic Multimodal Large Language Models Paper • 2507.12566 • Published Jul 16 • 14
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 178
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 643
Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling Paper • 2507.07982 • Published Jul 10 • 32
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology Paper • 2507.07999 • Published Jul 10 • 47
Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights Paper • 2506.02865 • Published Jun 3 • 31
Article 🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation By moonshotai and 1 other • Jun 21 • 68
Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL By toslali-ibm and 5 others • Jun 3 • 84