RLHF-And-Friends
community
Collections
2
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-a0
Text Generation • Updated • 28
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-a1
Text Generation • Updated • 29
RLHF-And-Friends/FedPPO-Isolated-Pythia-70M-a0
Text Generation • Updated • 34
RLHF-And-Friends/FedPPO-Isolated-Pythia-70M-a1
Text Generation • Updated • 34
models
19
RLHF-And-Friends/RM-TLDR-TLDR-Mistral-7B-SmallSFT
Text Classification • Updated • 13
RLHF-And-Friends/TLDR-Mistral-7B-SmallSFT-PPO
Text Generation • Updated • 22
RLHF-And-Friends/TLDR-Mistral-7B-Base-PPO
Updated • 23
RLHF-And-Friends/TLDR-Mistral-7B-Base-CoPPO
Updated • 17
RLHF-And-Friends/TLDR-Mistral-7B-SmallSFT-CoPPO
Text Generation • Updated • 18
RLHF-And-Friends/TLDR-Mistral-7B-SmallSFT
Text Generation • Updated • 53
RLHF-And-Friends/RM-TLDR-SFT-TLDR-Mistral-7B-v0.2
Text Classification • Updated • 16
RLHF-And-Friends/TLDR-Mistral-7B-SFT-PPO
Text Generation • Updated • 41
RLHF-And-Friends/TLDR-Mistral-7B-SFT
Text Generation • Updated • 108
RLHF-And-Friends/SFT-TLDR-Mistral-7B-v0.2
Text Generation • Updated • 62
datasets
5
RLHF-And-Friends/tldr-ppo-TLDR-Mistral-7B-Base-CoPPO-completions
Viewer • Updated • 100 • 49
RLHF-And-Friends/tldr-ppo-TLDR-Mistral-7B-SmallSFT-CoPPO-completions
Viewer • Updated • 100 • 49
RLHF-And-Friends/tldr-ppo
Viewer • Updated • 110k • 99
RLHF-And-Friends/tldr-sft
Viewer • Updated • 22k • 70
RLHF-And-Friends/tldr-preference
Viewer • Updated • 265k • 72