Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning Paper • 2503.05641 • Published 20 days ago • 2
MutaGReP: Execution-Free Repository-Grounded Plan Search for Code-Use Paper • 2502.15872 • Published Feb 21 • 5
mlfoundations-dev/dpo_from_stratos_judged_annotated_rejected_responses Viewer • Updated Feb 5 • 51k • 97
Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement Paper • 2404.04627 • Published Apr 6, 2024
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback Paper • 2410.06215 • Published Oct 8, 2024
DataEnvGym Collection Skills, datasets, etc. for DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback • 6 items • Updated Oct 10, 2024 • 1