Ah, got it! My bad! Thanks!
Vijeta Deshpande
vijetadeshpande
AI & ML interests
NLP, Healthcare
Recent Activity
commented on an article 12 days ago: The N Implementation Details of RLHF with PPO
commented on an article 12 days ago: The N Implementation Details of RLHF with PPO
liked a model about 2 months ago: mistralai/Mistral-Small-Instruct-2409
vijetadeshpande's activity
commented on The N Implementation Details of RLHF with PPO, 12 days ago
commented on The N Implementation Details of RLHF with PPO, 12 days ago
I am confused about the whitening function.
Shouldn't it be as follows?
whitened = (values - mean) / torch.rsqrt(var + 1e-8)
But the function here has,
whitened = (values - mean) * torch.rsqrt(var + 1e-8)
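For anyone else who trips over this: torch.rsqrt computes the reciprocal of the square root, so multiplying by it is the same as dividing by the standard deviation. Here is a small dependency-free sketch in plain Python that checks this numerically. The helper rsqrt below is a stand-in for torch.rsqrt, and the function names, the sample values, and the use of population variance are assumptions made only for illustration, not taken from the article.

```python
import math


def rsqrt(x: float) -> float:
    """Stand-in for torch.rsqrt: returns 1 / sqrt(x)."""
    return 1.0 / math.sqrt(x)


def mean_and_var(values):
    """Mean and population variance (assumed here; the article may differ)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def whiten_multiply_rsqrt(values, eps=1e-8):
    """The form used in the article: (values - mean) * rsqrt(var + eps)."""
    mean, var = mean_and_var(values)
    return [(v - mean) * rsqrt(var + eps) for v in values]


def whiten_divide_sqrt(values, eps=1e-8):
    """The textbook form: (values - mean) / sqrt(var + eps)."""
    mean, var = mean_and_var(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def divide_by_rsqrt(values, eps=1e-8):
    """The form proposed in the question; this multiplies by sqrt(var + eps)."""
    mean, var = mean_and_var(values)
    return [(v - mean) / rsqrt(var + eps) for v in values]


values = [1.0, 2.0, 4.0, 7.0]
a = whiten_multiply_rsqrt(values)
b = whiten_divide_sqrt(values)
c = divide_by_rsqrt(values)

# a and b agree and have roughly unit variance; c is scaled up instead.
print(mean_and_var(a)[1], mean_and_var(b)[1], mean_and_var(c)[1])
```

So dividing by rsqrt would scale the centered values by the standard deviation rather than normalize them, which is why the article multiplies.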
Citation? (#3 opened 6 months ago by vijetadeshpande)