ntyazh/content
Text Classification
Transformers
Safetensors
HumanLLMs/Human-Like-DPO-Dataset
llama
Generated from Trainer
trl
reward-trainer
text-generation-inference
main
content/wandb
1 contributor
History: 1 commit
ntyazh
ntyazh/llm-course-hw2-reward-model
93c48de
verified
6 months ago
Name                            Size       Last commit                           Updated
run-20250308_171545-7fko8ii0    -          ntyazh/llm-course-hw2-reward-model    6 months ago
run-20250308_171757-n8xih8yp    -          ntyazh/llm-course-hw2-reward-model    6 months ago
run-20250308_172343-a0bl55x6    -          ntyazh/llm-course-hw2-reward-model    6 months ago
run-20250308_184138-6shluj8g    -          ntyazh/llm-course-hw2-reward-model    6 months ago
run-20250308_184458-6upxn2a6    -          ntyazh/llm-course-hw2-reward-model    6 months ago
run-20250308_185009-791795t2    -          ntyazh/llm-course-hw2-reward-model    6 months ago
debug-internal.log              2.57 kB    ntyazh/llm-course-hw2-reward-model    6 months ago
debug.log                       17.5 kB    ntyazh/llm-course-hw2-reward-model    6 months ago