A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods Paper • 2502.01618 • Published Feb 3 • 10
gx-ai-architect/numinamath-178k-phi4-bon-verified-dpo-trl-40k-old-r1-format Viewer • Updated Feb 3 • 39k • 65
gx-ai-architect/official_dpo_r1_prompt_bo8_random_rej_balanced_fixed Viewer • Updated Feb 2 • 59.4k • 90