arxiv:2411.08147

Large Language Models Can Self-Improve in Long-context Reasoning

Published on Nov 12 · Submitted by Siheng99 on Nov 14
#1 Paper of the day

Abstract

Large language models (LLMs) have achieved substantial progress in processing long contexts but still struggle with long-context reasoning. Existing approaches typically involve fine-tuning LLMs with synthetic data, which depends on annotations from human experts or advanced models like GPT-4, thus restricting further advancements. To address this issue, we investigate the potential for LLMs to self-improve in long-context reasoning and propose SeaLong, an approach specifically designed for this purpose. This approach is straightforward: we sample multiple outputs for each question, score them with Minimum Bayes Risk, and then apply supervised fine-tuning or preference optimization based on these outputs. Extensive experiments on several leading LLMs demonstrate the effectiveness of SeaLong, with an absolute improvement of 4.2 points for Llama-3.1-8B-Instruct. Furthermore, SeaLong achieves superior performance compared to prior approaches that depend on data produced by human experts or advanced models. We anticipate that this work will open new avenues for self-improvement techniques in long-context scenarios, which are essential for the continual advancement of LLMs.
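
The sample-then-score step described in the abstract can be illustrated with a minimal sketch. It assumes a Minimum Bayes Risk score defined as each output's mean similarity to the other sampled outputs, and it uses a simple token-overlap (Jaccard) similarity as a stand-in, since the abstract does not name the similarity metric. The function names and the choice of the highest- and lowest-scoring outputs as a preference pair are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]. Assumed stand-in for the real metric."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def mbr_scores(
    outputs: List[str], sim: Callable[[str, str], float] = jaccard
) -> List[float]:
    """Score each output by its mean similarity to the other sampled outputs."""
    n = len(outputs)
    if n < 2:
        return [0.0] * n
    return [
        sum(sim(outputs[i], outputs[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def build_preference_pair(outputs: List[str]) -> Tuple[str, str]:
    """Return (chosen, rejected): the highest- and lowest-scoring outputs.

    Hypothetical pairing rule; the chosen output alone could instead serve
    as a supervised fine-tuning target.
    """
    scores = mbr_scores(outputs)
    ranked = sorted(range(len(outputs)), key=scores.__getitem__)
    return outputs[ranked[-1]], outputs[ranked[0]]


# Made-up samples for one question, standing in for outputs drawn from an LLM.
samples = [
    "the answer is paris",
    "the answer is paris",
    "answer: paris france",
    "the answer is london",
]
chosen, rejected = build_preference_pair(samples)
```

The intuition is that an output agreeing with most other samples is more likely to be correct, so consensus acts as a training signal without human or GPT-4 annotations. In practice a semantic similarity (for example, embedding cosine similarity) could be passed in through the `sim` argument.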

Community

1️⃣ We examine the unexplored potential of LLMs for long-context reasoning by analyzing diverse prompting techniques and expanding generation spaces.
2️⃣ We propose a novel method, SeaLong, designed to facilitate self-improvement of LLMs in long-context reasoning.
3️⃣ Extensive experiments across five tasks demonstrate the effectiveness of SeaLong, underscoring the potential of self-improvement in advancing LLMs.
