Qwen/Qwen3-Coder-30B-A3B-Instruct • Text Generation • 31B • Updated 14 days ago • 311k • 545
Qwen/Qwen3-Coder-480B-A35B-Instruct • Text Generation • 480B • Updated 14 days ago • 158k • 1.16k
Scaling Reasoning can Improve Factuality in Large Language Models • Paper • 2505.11140 • Published May 16 • 7
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models • Paper • 2505.10554 • Published May 15 • 121
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures • Paper • 2505.09343 • Published May 14 • 69