Hugging Face
Cijov/Qwen2-0.5B-GRPO-test
Transformers
TensorBoard
Safetensors
Generated from Trainer
trl
grpo
arxiv:2402.03300
main
Commit History
Training in progress, step 10 (5e7fcee, verified): Cijov committed 18 days ago
Model save (975a05c, verified): Cijov committed 19 days ago
Training in progress, step 22 (25baa55, verified): Cijov committed 19 days ago
Training in progress, step 20 (4d821c8, verified): Cijov committed 19 days ago
Training in progress, step 10 (4831623, verified): Cijov committed 19 days ago
Training in progress, step 20 (8851ebc, verified): Cijov committed 19 days ago
Training in progress, step 10 (c45954e, verified): Cijov committed 19 days ago
Training in progress, step 30 (1782e64, verified): Cijov committed 19 days ago
Training in progress, step 20 (8a59b42, verified): Cijov committed 19 days ago
Training in progress, step 10 (af96c19, verified): Cijov committed 19 days ago
initial commit (50ac015, verified): Cijov committed 19 days ago