Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 Paper • 2502.03544 • Published 6 days ago • 37
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published 4 days ago • 50
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning Paper • 2502.06781 • Published 1 day ago • 38
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published 1 day ago • 84
OpenR1-Math Collection Dataset and SFT model distilled from DeepSeek-R1. Check out our blog post for more details: https://huggingface.co/blog/open-r1/update-2 • 2 items • Updated about 22 hours ago • 2
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 11 items • Updated about 22 hours ago • 49
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 7 days ago • 154
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model Paper • 2501.12368 • Published 21 days ago • 39
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 29 days ago • 54
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 21 days ago • 316
Mastering Long Contexts in LLMs with KVPress Article • By nvidia and 1 other • 20 days ago • 62
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 28 days ago • 273