Yinxu Pan

cppowboy

https://github.com/Cppowboy

AI & ML interests

RL for LLMs, Code & Math Reasoning, Function Calling, Code Interpreter, Vision-Language Pretraining

Recent Activity

liked a dataset 3 days ago

MathArena/hmmt_feb_2025

liked a dataset 3 days ago

nvidia/OpenScienceReasoning-2

upvoted a paper 6 days ago

rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published 7 days ago • 89

upvoted a paper 8 days ago

Hermes 4 Technical Report

Paper • 2508.18255 • Published 10 days ago • 32

upvoted 3 papers 14 days ago

On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting

Paper • 2508.11408 • Published 20 days ago • 7

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Paper • 2508.14444 • Published 15 days ago • 35

MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published 21 days ago • 17

upvoted 4 papers about 1 month ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 294

RAVine: Reality-Aligned Evaluation for Agentic Search

Paper • 2507.16725 • Published Jul 22 • 28

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

Paper • 2507.16812 • Published Jul 22 • 62

GR-3 Technical Report

Paper • 2507.15493 • Published Jul 21 • 45

upvoted a paper 2 months ago

AsyncFlow: An Asynchronous Streaming RL Framework for Efficient LLM Post-Training

Paper • 2507.01663 • Published Jul 2 • 5

upvoted a collection 2 months ago

Kimina Prover Preview

Collection

State-of-the-Art Models for Formal Mathematical Reasoning • 5 items • Updated Apr 28 • 33

upvoted a paper 2 months ago

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1 • 76

upvoted 5 papers 3 months ago

V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

Paper • 2506.09985 • Published Jun 11 • 30

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16 • 263

DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

Paper • 2506.11763 • Published Jun 13 • 70

LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?

Paper • 2506.11928 • Published Jun 13 • 24

MiniCPM4: Ultra-Efficient LLMs on End Devices

Paper • 2506.07900 • Published Jun 9 • 90

upvoted a collection 3 months ago

MiniCPM4

Collection

MiniCPM4: Ultra-Efficient LLMs on End Devices • 22 items • Updated 28 days ago • 73

upvoted 2 papers 3 months ago

MiMo-VL Technical Report

Paper • 2506.03569 • Published Jun 4 • 79

AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

Paper • 2505.24298 • Published May 30 • 28
