LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published Dec 10, 2025 • 78
Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story Paper • 2511.15210 • Published Nov 19, 2025 • 89
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published Nov 17, 2025 • 136
Enhancing Vision-Language Model Training with Reinforcement Learning in Synthetic Worlds for Real-World Success Paper • 2508.04280 • Published Aug 6, 2025 • 35
No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL Article • Jun 3, 2025 • 98
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis Paper • 2506.06276 • Published Jun 6, 2025 • 26
Train Sparse Autoencoders Efficiently by Utilizing Features Correlation Paper • 2505.22255 • Published May 28, 2025 • 24
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models Paper • 2502.03032 • Published Feb 5, 2025 • 60
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published Feb 3, 2025 • 113
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 138