davidanugraha/LLaMA-3.2-3B-DPO-HelpSteer3-SkyworkLlama

Text Generation

Generated from Trainer

text-generation-inference

LLaMA-3.2-3B-DPO-HelpSteer3-SkyworkLlama / training_rewards_accuracies.png

Upload folder using huggingface_hub

9497555 verified 16 days ago

49.3 kB