Wenkai Yang
Keven16
5 followers · 3 following
https://keven980716.github.io/
keven980716
AI & ML interests
None yet
Recent Activity
upvoted an article 5 days ago: Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment
commented on a paper 8 days ago: ThinkDial: An Open Recipe for Controlling Reasoning Effort in Large Language Models
published a model about 1 month ago: Keven16/Qwen2.5-32B-TOPS-Iter-DPO-Preview
Organizations
None yet
Papers: 10
arxiv:2505.00662
arxiv:2502.18080
arxiv:2406.11431
arxiv:2404.02406
Models: 10
Keven16/DeepCritic-7B-RL1.5-PRM800K • 8B • Updated Jun 25 • 13
Keven16/DeepCritic-7B-RL1.5-Numina • 8B • Updated Jun 23 • 7
Keven16/Qwen2.5-32B-TOPS-Iter-DPO-Preview • 33B • Updated May 15 • 4
Keven16/Qwen2.5-32B-TOPS • 33B • Updated May 15 • 1
Keven16/Qwen2.5-32B-TOPS-Iter-DPO • 33B • Updated May 15 • 1
Keven16/Qwen2.5-32B-Tag • 33B • Updated May 15 • 1
Keven16/LLaMA3.1-8B-Tag • 8B • Updated May 15 • 1
Keven16/DeepCritic-7B-RL-PRM800K • 8B • Updated May 12 • 1
Keven16/DeepCritic-7B-RL-Numina • 8B • Updated May 12 • 4
Keven16/DeepCritic-7B-SFT • 8B • Updated May 12 • 4
Datasets: 2
Keven16/DeepCritic-RL-Data • Viewer • Updated May 13 • 55k • 6
Keven16/DeepCritic-4.5K • Preview • Updated May 13 • 12