Yihua Zhang

NormalUhr

AI & ML interests

None yet

NormalUhr's activity

published an article about 12 hours ago

Article • Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment • about 12 hours ago • 1

published an article 5 days ago

Article • DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge • 5 days ago • 22

published an article 8 days ago

Article • A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons • 8 days ago • 2

published an article 8 days ago

Article • From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning • 8 days ago • 6

published an article 8 days ago

Article • MLA: Redefining KV-Cache Through Low-Rank Projections and On-Demand Decompression • 8 days ago • 4

upvoted an article 5 months ago

Article • Optimizing your LLM in production • Sep 15, 2023 • 15

New activity in OPTML-Group/UnlearnCanvas 8 months ago

NonMatchingSplitsSizeError

#2 opened 9 months ago by yuyang-xue-ed

authored a paper 11 months ago

UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models

Paper • 2402.11846 • Published Feb 19, 2024 • 1

updated a dataset 11 months ago

OPTML-Group/UnlearnCanvas

Viewer • Updated Mar 6, 2024 • 1.76k • 852 • 2

liked a dataset 11 months ago

OPTML-Group/UnlearnCanvas

Viewer • Updated Mar 6, 2024 • 1.76k • 852 • 2

liked a Space 12 months ago

UnlearnCanvas Benchmark

🎨

liked a Space over 1 year ago

MusicGen 🎵 • Generate music from text and melody descriptions • 4.76k