Raja Biswas

rbiswasfc

AI & ML interests

NLP, Generative AI

Recent Activity

upvoted a paper about 16 hours ago

Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2

upvoted a paper about 16 hours ago

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

upvoted a paper about 16 hours ago

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

rbiswasfc's activity

upvoted 4 papers about 16 hours ago

upvoted 2 collections about 16 hours ago

OpenR1-Math

Collection

Dataset and SFT model distilled from DeepSeek-R1. Check out our blog post for more details: https://huggingface.co/blog/open-r1/update-2 • 2 items • Updated about 22 hours ago • 2

🧠 Reasoning datasets

Collection

Datasets with reasoning traces for math and code released by the community • 11 items • Updated about 22 hours ago • 49

upvoted a paper about 16 hours ago

The Curse of Depth in Large Language Models

Paper • 2502.05795 • Published 3 days ago • 10

upvoted an article 1 day ago

Open R1: Update #2

Article • and 6 others • 1 day ago • 128

upvoted a paper 2 days ago

On Teacher Hacking in Language Model Distillation

Paper • 2502.02671 • Published 7 days ago • 15

upvoted an article 2 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • 15 days ago • 715

upvoted 3 papers 2 days ago

Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published 7 days ago • 49

LIMO: Less is More for Reasoning

Paper • 2502.03387 • Published 7 days ago • 44

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 7 days ago • 154

upvoted a paper 4 days ago

Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11 • 83

upvoted an article 8 days ago

Open-R1: Update #1

Article • and 7 others • 10 days ago • 270

upvoted 3 papers 17 days ago

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Paper • 2501.12368 • Published 21 days ago • 39

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published 29 days ago • 54

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 21 days ago • 316

upvoted an article 19 days ago

Mastering Long Contexts in LLMs with KVPress

Article • and 1 other • 20 days ago • 62

upvoted a paper 20 days ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 28 days ago • 273