D-CORE: Incentivizing Task Decomposition in Large Reasoning Models for Complex Tool Use
Abstract
A two-stage training framework called D-CORE is proposed to improve large reasoning models' ability to decompose complex tasks and compose reasoning processes, achieving superior performance on tool-use benchmarks.
Effective tool use and reasoning are essential capabilities for large reasoning models (LRMs) to address complex real-world problems. Through empirical analysis, we identify that current LRMs lack sub-task decomposition capability in complex tool-use scenarios, leading to Lazy Reasoning. To address this, we propose D-CORE (Decomposing tasks and Composing Reasoning processes), a two-stage training framework that first incentivizes the LRMs' task-decomposition reasoning capability via self-distillation, followed by diversity-aware reinforcement learning (RL) to restore the LRMs' reflective reasoning capability. D-CORE achieves robust tool-use improvements across diverse benchmarks and model scales. Experiments on BFCLv3 demonstrate the superiority of our method: D-CORE-8B reaches 77.7% accuracy, surpassing the best-performing 8B model by 5.7%. Meanwhile, D-CORE-14B establishes a new state-of-the-art at 79.3%, outperforming 70B models despite being 5× smaller. The source code is available at https://github.com/alibaba/EfficientAI.
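The two-stage recipe described above, self-distillation to elicit task decomposition followed by diversity-aware RL, can be pictured with a minimal sketch. Everything below is a hypothetical mock-up written against the abstract alone: the MockLRM class, the Trace fields, and the diversity_aware_reward bonus are illustrative assumptions, not the paper's actual training code, data format, or reward design.

```python
"""Illustrative sketch of a D-CORE-style two-stage pipeline.
Stage 1: self-distillation on the model's own decomposition traces.
Stage 2: a diversity-aware reward for RL rollouts.
All names and mechanisms here are assumptions, not the authors' API."""

import random
from collections import Counter
from dataclasses import dataclass


@dataclass
class Trace:
    text: str
    solved: bool          # did the rollout complete the task?
    has_subtasks: bool    # did it explicitly decompose into sub-tasks?
    signature: str        # coarse fingerprint of the solution path


class MockLRM:
    """Stand-in for a large reasoning model; a real pipeline would use
    an actual LRM plus an SFT/RL training library."""

    def sample(self, task: str) -> Trace:
        return Trace(
            text=f"trace for {task}",
            solved=random.random() < 0.5,
            has_subtasks=random.random() < 0.5,
            signature=random.choice("abc"),
        )

    def update_on(self, task: str, trace: Trace) -> None:
        pass  # a gradient step in a real implementation


def self_distill(model: MockLRM, tasks: list[str], n_samples: int = 8) -> MockLRM:
    """Stage 1 (assumed): sample the model's own traces, keep only those
    that solve the task via explicit decomposition, and fine-tune on them."""
    for task in tasks:
        for _ in range(n_samples):
            tr = model.sample(task)
            if tr.solved and tr.has_subtasks:
                model.update_on(task, tr)
    return model


def diversity_aware_reward(group: list[Trace], task_reward) -> list[float]:
    """Stage 2 (assumed): add a bonus that down-weights rollouts whose
    solution signature repeats within a sampled group, so RL does not
    collapse the model's reflective, exploratory reasoning."""
    counts = Counter(tr.signature for tr in group)
    return [task_reward(tr) + 1.0 / counts[tr.signature] for tr in group]


if __name__ == "__main__":
    model = self_distill(MockLRM(), ["book a flight", "plan a trip"])
    group = [model.sample("plan a trip") for _ in range(4)]
    print(diversity_aware_reward(group, lambda tr: float(tr.solved)))
```

The diversity bonus shown here is just one plausible instantiation of "diversity-aware RL": rewarding dissimilar rollouts within a group is a common way to preserve exploration, but the paper's actual objective may differ.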
Community
Large Reasoning Models have achieved significant success on mathematical and reasoning tasks. We investigate whether this success can be replicated in complex tool-use scenarios.