Qwen2.5-3B-Instruct-grpo-E6-D100-L4096-lr5e7 / model-00001-of-00002.safetensors

Commit History

Training in progress, epoch 2
53f6b05
verified

chenggong1995 committed on

Training in progress, epoch 1
a672ead
verified

chenggong1995 committed on

Training in progress, epoch 0
0ecc16a
verified

chenggong1995 committed on