DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 108
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 109
The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • 2.33k