kenhktsui/Qwen-0.5B-GRPO

Text Generation

Generated from Trainer

text-generation-inference

Qwen-0.5B-GRPO / vocab.json

kenhktsui/Qwen-0.5B-GRPO-gsm8k-correct-reward

a000783 verified about 2 months ago

2.78 MB

File too large to display, you can check the raw version instead.