DeyangKong

AI & ML interests

Natural Language Processing

Recent Activity

upvoted a paper 7 days ago

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published 8 days ago • 60

upvoted a paper 12 days ago

DEER: Draft with Diffusion, Verify with Autoregressive Models

Paper • 2512.15176 • Published 14 days ago • 41

liked 2 models about 1 month ago

sentence-transformers/all-MiniLM-L6-v2

OpenMOSS-Team/DiRL-8B-Instruct

8B • Updated about 22 hours ago • 39 • 10

liked a dataset about 2 months ago

microsoft/rStar-Coder

Viewer • Updated Jul 20 • 1.86M • 4.86k • 219

liked a model 2 months ago

Buchilaguo/ATF-8B

8B • Updated Oct 21 • 54 • 1

liked a model 4 months ago

meituan-longcat/LongCat-Flash-Chat

Text Generation • 562B • Updated Sep 24 • 22.6k • 513

upvoted 2 papers 7 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 263

Skywork Open Reasoner 1 Technical Report

Paper • 2505.22312 • Published May 28 • 54

authored a paper 7 months ago

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

Paper • 2505.17652 • Published May 23 • 6

upvoted a paper 7 months ago

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

Paper • 2505.17652 • Published May 23 • 6

commented a paper 7 months ago

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

Paper • 2505.17652 • Published May 23 • 6

upvoted a paper 9 months ago

Efficient Reinforcement Finetuning via Adaptive Curriculum Learning

Paper • 2504.05520 • Published Apr 7 • 11

liked a dataset 9 months ago

lime-nlp/DeepScaleR_Difficulty

Viewer • Updated Apr 10 • 5.06M • 206 • 9

liked a model 9 months ago

agentica-org/DeepCoder-14B-Preview

Text Generation • 15B • Updated May 11 • 1.76k • 681

upvoted 2 papers 9 months ago

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

Paper • 2503.18892 • Published Mar 24 • 31

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Paper • 2503.16419 • Published Mar 20 • 77

upvoted 2 papers 10 months ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 144

SampleMix: A Sample-wise Pre-training Data Mixing Strategey by Coordinating Data Quality and Diversity

Paper • 2503.01506 • Published Mar 3 • 10

commented a paper 10 months ago

SampleMix: A Sample-wise Pre-training Data Mixing Strategey by Coordinating Data Quality and Diversity

Paper • 2503.01506 • Published Mar 3 • 10
