Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models • Paper • 2502.16033 • Published 20 days ago • 16
Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding • Paper • 2406.19263 • Published Jun 27, 2024 • 10