Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated 6 days ago • 64
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions Paper • 2410.02743 • Published Oct 3, 2024 • 7
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response Paper • 2412.14922 • Published 25 days ago • 85
NILE: Internal Consistency Alignment in Large Language Models Paper • 2412.16686 • Published 22 days ago • 8
Reliable, Reproducible, and Really Fast Leaderboards with Evalica Paper • 2412.11314 • Published 28 days ago • 2
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 41
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge Paper • 2411.16594 • Published Nov 25, 2024 • 37
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 58
Response Tuning: Aligning Large Language Models without Instruction Paper • 2410.02465 • Published Oct 3, 2024 • 12
MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs Paper • 2410.04698 • Published Oct 7, 2024 • 13
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References Paper • 2410.05193 • Published Oct 7, 2024 • 13
Collaborative Performance Prediction for Large Language Models Paper • 2407.01300 • Published Jul 1, 2024 • 2