Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding Paper • 2512.17532 • Published 7 days ago • 62
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published 4 days ago • 60
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published Nov 19 • 226
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6 • 210
AI Paper of the Day Collection A collection of papers that I think are interesting, one added each day • 551 items • Updated about 11 hours ago • 73
Script: Graph-Structured and Query-Conditioned Semantic Token Pruning for Multimodal Large Language Models Paper • 2512.01949 • Published 25 days ago • 8
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling Paper • 2511.20785 • Published Nov 25 • 162
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published Nov 20 • 91
First Try Matters: Revisiting the Role of Reflection in Reasoning Models Paper • 2510.08308 • Published Oct 9 • 24
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published Jul 19 • 134
100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models Paper • 2505.00551 • Published May 1 • 36