Hugh Zhang

hugh-scale

hugh-scale's activity

authored a paper about 2 months ago

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24 • 66

authored 4 papers 6 months ago

Chain-of-Thought Reasoning is a Policy Improvement Operator

Paper • 2309.08589 • Published Sep 15, 2023 • 1

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Paper • 2402.14688 • Published Feb 22, 2024

NATURAL PLAN: Benchmarking LLMs on Natural Language Planning

Paper • 2406.04520 • Published Jun 6, 2024 • 14

LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet

Paper • 2408.15221 • Published Aug 27, 2024

updated a dataset 7 months ago

codegenning/B_livecodebench_C

Updated Aug 16, 2024 • 174 • 61

New activity in meta-llama/Llama-3.1-8B-Instruct 8 months ago

Update chat template

#53 opened 8 months ago by

authored a paper 11 months ago

A Careful Examination of Large Language Model Performance on Grade School Arithmetic

Paper • 2405.00332 • Published May 1, 2024 • 32

upvoted a paper 11 months ago

A Careful Examination of Large Language Model Performance on Grade School Arithmetic

Paper • 2405.00332 • Published May 1, 2024 • 32

updated a dataset about 1 year ago

hugh-scale/hugh

Updated Feb 22, 2024 • 10