RLHF-And-Friends
community
Collections
1
models
16
RLHF-And-Friends/Llama-3.1-8B-SFT-Uch • 2
RLHF-And-Friends/TLDR-Llama-3.1-8B-SmallSFT-PPO • Text Generation • 9
RLHF-And-Friends/TLDR-Llama-3.1-8B-SmallSFT • Text Generation • 30
RLHF-And-Friends/TLDR-Llama-3.1-8B-Base-PPO • Text Generation • 22
RLHF-And-Friends/TLDR-Llama-3.1-8B-SmallSFT-RM • Text Classification • 9
RLHF-And-Friends/TLDR-Mistral-7B-Base-GRPO • 5
RLHF-And-Friends/TLDR-Mistral-7B-SmallSFT-RM • Text Classification • 2
RLHF-And-Friends/TLDR-Mistral-7B-SmallSFT-PPO • Text Generation • 26
RLHF-And-Friends/TLDR-Mistral-7B-Base-PPO • 2
RLHF-And-Friends/TLDR-Mistral-7B-Base-CoPPO • 3
datasets
15
RLHF-And-Friends/Humans-vs-Llama-SmallSFT-PPO • 1k • 52
RLHF-And-Friends/ultrachat-preprocessed • 515k • 59
RLHF-And-Friends/Humans-vs-Llama-Base-PPO • 1k • 43
RLHF-And-Friends/Human-vs-Shapa-8x • 1k • 73
RLHF-And-Friends/Human-vs-Shapa-4x • 1k • 56
RLHF-And-Friends/Human-vs-Shapa-2x • 1k • 56
RLHF-And-Friends/SFT-vs-Shapa-CoPPO-8x • 100 • 79
RLHF-And-Friends/SFT-vs-Shapa-CoPPO-4x • 100 • 74
RLHF-And-Friends/SFT-vs-Shapa-CoPPO-2x • 100 • 69
RLHF-And-Friends/SFT-vs-BaseGRPO • 100 • 58