ScaleML-RLHF/Qwen2.5-Math-1.5B-raft-plusplus-numina_math_em-sample1n16-sample4-iter1-step_9 2B • Updated Mar 24
ScaleML-RLHF/Qwen2.5-Math-1.5B-raft-plusplus-numina_math_em-sample1n8-sample4-iter4-step_9 2B • Updated Mar 22