RLHF-And-Friends
community
Collections
2
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-a0
Text Generation • Updated • 43
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-a1
Text Generation • Updated • 42
RLHF-And-Friends/FedPPO-Isolated-Pythia-70M-a0
Text Generation • Updated • 47
RLHF-And-Friends/FedPPO-Isolated-Pythia-70M-a1
Text Generation • Updated • 46
models
19
RLHF-And-Friends/RM-TLDR-TLDR-Mistral-7B-SmallSFT
Text Classification • Updated • 5
RLHF-And-Friends/TLDR-Mistral-7B-SmallSFT-PPO
Text Generation • Updated • 12
RLHF-And-Friends/TLDR-Mistral-7B-Base-PPO
Updated • 13
RLHF-And-Friends/TLDR-Mistral-7B-Base-CoPPO
Updated • 7
RLHF-And-Friends/TLDR-Mistral-7B-SmallSFT-CoPPO
Text Generation • Updated • 11
RLHF-And-Friends/TLDR-Mistral-7B-SmallSFT
Text Generation • Updated • 37
RLHF-And-Friends/RM-TLDR-SFT-TLDR-Mistral-7B-v0.2
Text Classification • Updated • 8
RLHF-And-Friends/TLDR-Mistral-7B-SFT-PPO
Text Generation • Updated • 27
RLHF-And-Friends/TLDR-Mistral-7B-SFT
Text Generation • Updated • 89
RLHF-And-Friends/SFT-TLDR-Mistral-7B-v0.2
Text Generation • Updated • 50
datasets
5
RLHF-And-Friends/tldr-ppo-TLDR-Mistral-7B-Base-CoPPO-completions
Viewer • Updated • 100 • 7
RLHF-And-Friends/tldr-ppo-TLDR-Mistral-7B-SmallSFT-CoPPO-completions
Viewer • Updated • 100 • 7
RLHF-And-Friends/tldr-ppo
Viewer • Updated • 110k • 38
RLHF-And-Friends/tldr-sft
Viewer • Updated • 22k • 33
RLHF-And-Friends/tldr-preference
Viewer • Updated • 265k • 45