
Collections

Discover the best community collections!

Collections including paper arxiv:2501.12948

Reasoning Language Models: A Blueprint

Paper • 2501.11223 • Published 18 days ago • 31
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 88
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 16 days ago • 302

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published 28 days ago • 60
apple/OpenELM

Updated May 2, 2024 • 1.43k
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Text Generation • Updated 6 days ago • 526k • 703
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 16 days ago • 302

OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking

Paper • 2501.09751 • Published 22 days ago • 47
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published 22 days ago • 36
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 16 days ago • 302
s1: Simple test-time scaling

Paper • 2501.19393 • Published 7 days ago • 88

Llms and reasoning

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published 22 days ago • 36
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 16 days ago • 302
Chain-of-Retrieval Augmented Generation

Paper • 2501.14342 • Published 14 days ago • 48
RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published 14 days ago • 22


Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published Jan 7 • 68
Phi-4 Technical Report

Paper • 2412.08905 • Published Dec 12, 2024 • 106
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 24 days ago • 273
DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 49

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 98
Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published 29 days ago • 91
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 30 days ago • 254
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 16 days ago • 302

OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning

Paper • 2412.16849 • Published Dec 22, 2024 • 9
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 16 days ago • 302

My reading list!

RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Paper • 2412.14922 • Published Dec 19, 2024 • 85
Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 345
Progressive Multimodal Reasoning via Active Retrieval

Paper • 2412.14835 • Published Dec 19, 2024 • 73
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 22 days ago • 67

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Paper • 2412.18319 • Published Dec 24, 2024 • 37
Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published Dec 24, 2024 • 45
Efficiently Serving LLM Reasoning Programs with Certaindex

Paper • 2412.20993 • Published Dec 30, 2024 • 35
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published Dec 23, 2024 • 46

RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Paper • 2412.14922 • Published Dec 19, 2024 • 85
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published Dec 23, 2024 • 46
Deliberation in Latent Space via Differentiable Cache Augmentation

Paper • 2412.17747 • Published Dec 23, 2024 • 30
Outcome-Refining Process Supervision for Code Generation

Paper • 2412.15118 • Published Dec 19, 2024 • 19
