R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model Paper • 2503.05132 • Published 10 days ago • 49
view article Article Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset By sdiazlor • Feb 10 • 48
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning Paper • 2503.07365 • Published 6 days ago • 53