llama-3-8b-instruct-vpo-iter2 / train_results.json
End of training
c6abd8b verified
{
"epoch": 1.0,
"total_flos": 0.0,
"train_loss": 0.6915523528288572,
"train_runtime": 2770.327,
"train_samples": 19958,
"train_samples_per_second": 7.204,
"train_steps_per_second": 0.113
}
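
The throughput fields above can be cross-checked against each other. The following is a minimal sketch that assumes `train_samples_per_second` is computed as `train_samples * epoch / train_runtime`, which matches the reported numbers here but is not stated in the file. The variable names are illustrative only. Because `train_steps_per_second` is rounded to three decimals, the step count and samples-per-step figures are rough estimates.

```python
import json

# Values copied verbatim from the results file above.
results = json.loads("""
{
  "epoch": 1.0,
  "total_flos": 0.0,
  "train_loss": 0.6915523528288572,
  "train_runtime": 2770.327,
  "train_samples": 19958,
  "train_samples_per_second": 7.204,
  "train_steps_per_second": 0.113
}
""")

# Assumption: throughput = samples seen over all epochs / wall-clock seconds.
samples_per_second = (
    results["train_samples"] * results["epoch"] / results["train_runtime"]
)

# Rough estimates only: steps_per_second is rounded to 3 decimals in the file.
approx_steps = results["train_steps_per_second"] * results["train_runtime"]
approx_samples_per_step = (
    results["train_samples_per_second"] / results["train_steps_per_second"]
)

print(f"recomputed samples/s: {samples_per_second:.3f}")
print(f"approx. optimizer steps: {approx_steps:.0f}")
print(f"approx. samples per step: {approx_samples_per_step:.1f}")
```

The recomputed samples-per-second figure agrees with the reported 7.204. The samples-per-step estimate suggests an effective batch size in the low 60s, but that is an inference from rounded values, not something the file records.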