O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
Abstract
Recently, long-thought reasoning LLMs, such as OpenAI's O1, have adopted extended reasoning processes similar to how humans ponder complex problems. This reasoning paradigm significantly enhances the model's problem-solving ability and has achieved promising results. However, the long-thought reasoning process leads to a substantial increase in inference time. A pressing challenge is to reduce the inference overhead of long-thought LLMs while ensuring accuracy. In this paper, we experimentally demonstrate that long-thought reasoning models struggle to allocate token budgets according to problem difficulty, which results in substantial reasoning redundancy. To address this, we propose Length-Harmonizing Fine-Tuning (O1-Pruner), aimed at minimizing reasoning overhead while maintaining accuracy. The method first estimates the LLM's baseline performance through pre-sampling and then uses RL-style fine-tuning to encourage the model to generate shorter reasoning processes under an accuracy constraint, allowing the model to reason efficiently with lower redundancy. Experiments on various mathematical reasoning benchmarks show that O1-Pruner not only significantly reduces inference overhead but also achieves higher accuracy, providing a novel and promising solution to this challenge. Our code is coming soon at https://github.com/StarDewXXX/O1-Pruner
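To make the training signal described in the abstract concrete, below is a minimal Python sketch of the two pieces it names: pre-sampling to estimate a baseline, and a reward that favors shorter solutions as long as accuracy does not drop. The `model.sample`, `tokens`, and `is_correct` interfaces, the functional form of the reward, and the weight `lam` are illustrative assumptions for this sketch, not the authors' actual implementation (see the linked repository for that).

```python
import statistics

def presample_baseline(model, problem, n=8):
    """Estimate the model's baseline behaviour on a problem by sampling
    several solutions before fine-tuning. The `model.sample` interface
    and the solution attributes are hypothetical."""
    solutions = [model.sample(problem) for _ in range(n)]
    ref_len = statistics.mean(len(s.tokens) for s in solutions)
    ref_acc = statistics.mean(float(s.is_correct) for s in solutions)
    return ref_len, ref_acc

def length_harmonizing_reward(pred_len, pred_correct, ref_len, ref_acc, lam=2.0):
    """Reward solutions that are shorter than the pre-sampled baseline
    while penalizing any accuracy drop relative to that baseline.
    This functional form and the weight `lam` are assumptions, not
    taken verbatim from the paper."""
    length_term = ref_len / max(pred_len, 1) - 1.0   # positive when shorter than baseline
    accuracy_term = float(pred_correct) - ref_acc    # negative when accuracy degrades
    return length_term + lam * accuracy_term
```

In an RL-style fine-tuning loop, each sampled solution would be scored with this reward and the policy updated to prefer shorter, still-correct reasoning traces; the weight on the accuracy term controls how strictly the accuracy constraint is enforced.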
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs (2024)
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling (2025)
- Token-Budget-Aware LLM Reasoning (2024)
- Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning (2024)
- Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability (2024)
- Enhancing the Reasoning Capabilities of Small Language Models via Solution Guidance Fine-Tuning (2024)
- Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS (2024)