Open to Collab

Michal Valko

misovalko

https://misovalko.github.io/

AI & ML interests

large language models, reasoning, fine-tuning, test-time computation, reinforcement learning with human feedback, world models

Recent Activity

upvoted a paper 11 days ago

A General Theoretical Paradigm to Understand Learning from Human Preferences

authored a paper 11 days ago

Optimal Design for Reward Modeling in RLHF

authored a paper 11 days ago

Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms

upvoted a paper 11 days ago

A General Theoretical Paradigm to Understand Learning from Human Preferences

Paper • 2310.12036 • Published Oct 18, 2023 • 19

upvoted a paper 7 months ago

Accelerating Nash Learning from Human Feedback via Mirror Prox

Paper • 2505.19731 • Published May 26 • 6