Robin Williams PRO
bfuzzy1
AI & ML interests
None yet
Recent Activity
upvoted a paper about 12 hours ago
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
updated a collection about 12 hours ago
RL
upvoted a paper about 12 hours ago
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Organizations
None yet
Collections
12
- RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
  Paper • 2412.14922 • Published • 85
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
  Paper • 2412.17256 • Published • 45
- Deliberation in Latent Space via Differentiable Cache Augmentation
  Paper • 2412.17747 • Published • 29
- Outcome-Refining Process Supervision for Code Generation
  Paper • 2412.15118 • Published • 19
models
10
bfuzzy1/acheron-m1a-llama • Text Generation • Updated • 4
bfuzzy1/acheron-m • Text Generation • Updated • 172
bfuzzy1/acheron-d • Updated • 23
bfuzzy1/llambses-1 • Text Generation • Updated • 109
bfuzzy1/acheron-o9 • Updated • 10
bfuzzy1/acheron • Updated • 12
bfuzzy1/acheron-c • Updated • 10
bfuzzy1/Gunny • Text Generation • Updated • 215
bfuzzy1/llambses-1_4bit • Updated • 7
bfuzzy1/acheron-x • Text Generation • Updated • 6