OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs Paper • 2510.10689 • Published Oct 12 • 46
V-HUB: A Visual-Centric Humor Understanding Benchmark for Video LLMs Paper • 2509.25773 • Published Sep 30
Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception Paper • 2510.12720 • Published Oct 14 • 1
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space Paper • 2505.13308 • Published May 19 • 27
OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts Paper • 2503.22952 • Published Mar 29 • 17