ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement Paper • 2504.01934 • Published about 14 hours ago • 11
PaperBench: Evaluating AI's Ability to Replicate AI Research Paper • 2504.01848 • Published about 16 hours ago • 11
Improved Visual-Spatial Reasoning via R1-Zero-Like Training Paper • 2504.00883 • Published 1 day ago • 38
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization Paper • 2504.00999 • Published 1 day ago • 46
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems? Paper • 2504.00509 • Published 2 days ago • 14
Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents Paper • 2504.00906 • Published 1 day ago • 17
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 Paper • 2503.24376 • Published 3 days ago • 30
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published 3 days ago • 52
Large Language Model Agent: A Survey on Methodology, Applications and Challenges Paper • 2503.21460 • Published 7 days ago • 67
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization Paper • 2503.19901 • Published 9 days ago • 19
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models Paper • 2503.24235 • Published 3 days ago • 40
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes Paper • 2503.23461 • Published 4 days ago • 70
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published 3 days ago • 51
Effectively Controlling Reasoning Models through Thinking Intervention Paper • 2503.24370 • Published 3 days ago • 16
SparseFlex: High-Resolution and Arbitrary-Topology 3D Shape Modeling Paper • 2503.21732 • Published 7 days ago • 6
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback Paper • 2503.22230 • Published 6 days ago • 41
ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model Paper • 2503.21144 • Published 7 days ago • 23