MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published 3 days ago • 141
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 2 days ago • 94
InfiR: Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning Paper • 2502.11573 • Published 6 days ago • 7
Why Safeguarded Ships Run Aground? Aligned Large Language Models' Safety Mechanisms Tend to Be Anchored in The Template Region Paper • 2502.13946 • Published 3 days ago • 9
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published 7 days ago • 130
Learning Getting-Up Policies for Real-World Humanoid Robots Paper • 2502.12152 • Published 5 days ago • 35
Precise Parameter Localization for Textual Generation in Diffusion Models Paper • 2502.09935 • Published 9 days ago • 11
Diverse Inference and Verification for Advanced Reasoning Paper • 2502.09955 • Published 9 days ago • 16
TransMLA: Multi-head Latent Attention Is All You Need Paper • 2502.07864 • Published 11 days ago • 43
Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model Paper • 2501.05122 • Published Jan 9 • 20
OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for LLM Training Paper • 2501.08197 • Published Jan 14 • 8
PokerBench: Training Large Language Models to become Professional Poker Players Paper • 2501.08328 • Published Jan 14 • 17
The Geometry of Tokens in Internal Representations of Large Language Models Paper • 2501.10573 • Published Jan 17 • 9