Jian Hu

chuyi777

https://hujian.website

hijkzzz

AI & ML interests

Reinforcement Learning

chuyi777's activity

liked 2 datasets 17 days ago

open-r1/OpenR1-Math-220k

Viewer • Updated 23 days ago • 450k • 53k • 491

open-thoughts/OpenThoughts-114k

Viewer • Updated 22 days ago • 228k • 86.6k • 652

liked a model 17 days ago

qihoo360/TinyR1-32B-Preview

Text Generation • Updated 4 days ago • 5.47k • 320

upvoted a paper 17 days ago

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published 21 days ago • 45

upvoted a paper about 1 month ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 111

liked a model about 2 months ago

CohereForAI/c4ai-command-r7b-12-2024

Text Generation • Updated 21 days ago • 7.15k • 373

upvoted a paper 2 months ago

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 92

commented a paper 2 months ago

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 92

liked 2 datasets 2 months ago

AI-MO/NuminaMath-CoT

Viewer • Updated Nov 25, 2024 • 860k • 8.23k • 421

yingyingzhang/metamath-qwen2-math

Viewer • Updated Oct 1, 2024 • 467k • 305 • 32

upvoted a paper 3 months ago

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 80

updated 3 models 3 months ago

liked a model 4 months ago

O1-OPEN/OpenO1-LLama-8B-v0.1

Updated Oct 8, 2024 • 127 • 17

updated a model 4 months ago

OpenRLHF/Mistral-7b-PRM-Math-Shepherd

Updated Oct 30, 2024 • 19 • 1

New activity in OpenRLHF/Mistral-7b-PRM-Math-Shepherd 4 months ago

How do I download the model?

#1 opened 5 months ago by Yutong001

liked 3 models 5 months ago

AI-MO/NuminaMath-7B-TIR

Text Generation • Updated Aug 14, 2024 • 26.1k • 339

Nexusflow/Athene-70B

Text Generation • Updated Nov 15, 2024 • 3.05k • 196

peiyi9979/mistral-7b-sft

Text Generation • Updated Jan 15, 2024 • 648 • 7