HINT-lab's Collections
Reward-Calibration
HINT-lab/llama3-8b-final-ppo-c-v0.3 • Text Generation • 13
HINT-lab/mistral-7b-hermes-crm-skywork
HINT-lab/mistral-7b-hermes-cdpo-v0.2 • Text Generation • 21
HINT-lab/mistral-7b-ppo-clean-hermes • Text Generation • 10
HINT-lab/mistral-7b-ppo-hermes-v0.3 • Text Generation • 13 • 1
HINT-lab/mistral-7b-ppo-m-hermes • Text Generation • 13
HINT-lab/llama3-8b-cdpo-v0.2 • Text Generation • 10
HINT-lab/llama3-8b-final-ppo-v0.3 • Text Generation • 6
HINT-lab/mistral-7b-hermes-rm-skywork • 17
HINT-lab/llama3-8b-final-ppo-m-v0.3 • Text Generation • 29
HINT-lab/llama3-8b-crm-final-v0.1 • 11
HINT-lab/llama3-8b-final-ppo-clean-v0.1 • Text Generation • 8
HINT-lab/mistral-7b-hermes-dpo-v0.2 • Text Generation • 8
HINT-lab/mistral-7b-ppo-c-hermes • Text Generation • 15
HINT-lab/llama3-8b-dpo-v0.2 • Text Generation • 10