treasure4l/Llama3.2-Instruct-DPO
Safetensors
trl-lib/ultrafeedback_binarized
arxiv:1910.09700
Llama3.2-Instruct-DPO/tokenizer.json (main)
Commit History
Upload 6 files
fb37779 (verified)
treasure4l committed on Jan 14