Ziniu Li

znli

AI & ML interests

None yet

Organizations

None yet

znli's activity

upvoted a paper 17 days ago

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models

Paper • 2310.10505 • Published Oct 16, 2023 • 1

commented on a paper about 1 month ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 108

liked a dataset about 2 months ago

allenai/olmo-mix-1124

Viewer • Updated Dec 2, 2024 • 99.1M • 28.8k • 36

upvoted a paper about 2 months ago

Enabling Scalable Oversight via Self-Evolving Critic

Paper • 2501.05727 • Published Jan 10 • 70

liked a model 3 months ago

deepseek-ai/DeepSeek-V2.5-1210

Text Generation • Updated Dec 11, 2024 • 1.42k • 252

liked 2 models 4 months ago

mistralai/Ministral-8B-Instruct-2410

Updated Dec 6, 2024 • 38.3k • 444

bartowski/Mistral-Large-Instruct-2407-GGUF

Text Generation • Updated Aug 27, 2024 • 2.81k • 30

New activity in Qwen/Qwen2.5-Math-RM-72B 5 months ago

Quantized Version

#8 opened 5 months ago by znli

liked a dataset 6 months ago

ajibawa-2023/Maths-College

Viewer • Updated May 8, 2024 • 970k • 244 • 41

commented on a paper 6 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 138

liked a dataset 6 months ago

AI-MO/NuminaMath-CoT

Viewer • Updated Nov 25, 2024 • 860k • 8.23k • 421

liked 3 datasets 7 months ago

liked a model 7 months ago

deepseek-ai/deepseek-coder-7b-base-v1.5

Text Generation • Updated Feb 4, 2024 • 2.34k • 43

liked a dataset 7 months ago

WizardLMTeam/WizardLM_evol_instruct_70k

Viewer • Updated Mar 10, 2024 • 70k • 397 • 190

liked 4 datasets 8 months ago

h2oai/openassistant_oasst1

Viewer • Updated Apr 19, 2023 • 46.3k • 127 • 7

timdettmers/openassistant-guanaco

Viewer • Updated May 27, 2023 • 10.4k • 7.73k • 427

nvidia/HelpSteer2

Viewer • Updated Dec 18, 2024 • 21.4k • 3.47k • 409

pankajmathur/WizardLM_Orca

Viewer • Updated Jun 26, 2023 • 55k • 123 • 70