Paper Detail

LightThinker++: From Reasoning Compression to Memory Management

Yuqi Zhu, Jintian Zhang, Zhenjie Wan, Yujie Luo, Shuofei Qiao, Zhengke Gui, Da Zheng, Lei Liang, Huajun Chen, Ningyu Zhang

Browse

Workflow Queues

huggingface Score 15.5

Published 2026-04-04 · First seen 2026-04-07

General AI

Open paper source

Abstract

Large language models (LLMs) excel at complex reasoning, yet their efficiency is limited by the surging cognitive overhead of long thought traces. In this paper, we propose LightThinker, a method that enables LLMs to dynamically compress intermediate thoughts into compact semantic representations. However, static compression often struggles with complex reasoning where the irreversible loss of intermediate details can lead to logical bottlenecks. To address this, we evolve the framework into LightThinker++, introducing Explicit Adaptive Memory Management. This paradigm shifts to behavioral-level management by incorporating explicit memory primitives, supported by a specialized trajectory synthesis pipeline to train purposeful memory scheduling. Extensive experiments demonstrate the framework's versatility across three dimensions. (1) LightThinker reduces peak token usage by 70% and inference time by 26% with minimal accuracy loss. (2) In standard reasoning, LightThinker++ slashes peak token usage by 69.9% while yielding a +2.42% accuracy gain under the same context budget for maximum performance. (3) Most notably, in long-horizon agentic tasks, it maintains a stable footprint beyond 80 rounds (a 60%-70% reduction), achieving an average performance gain of 14.8% across different complex scenarios. Overall, our work provides a scalable direction for sustaining deep LLM reasoning over extended horizons with minimal overhead.

Workflow Status

Review status: pending
Role: unreviewed
Read priority: now
Vote: Not set.
Saved: no
Collections: Not filed yet.
Next action: Not filled yet.

Reading Brief

No structured notes yet. Add `summary_sections`, `why_relevant`, `claim_impact`, or `next_action` in `papers.jsonl` to enrich this view.

Why It Surfaced

No ranking explanation is available yet.

BibTeX

@misc{zhu2026lightthinker,
  title = {LightThinker++: From Reasoning Compression to Memory Management},
  author = {Yuqi Zhu and Jintian Zhang and Zhenjie Wan and Yujie Luo and Shuofei Qiao and Zhengke Gui and Da Zheng and Lei Liang and Huajun Chen and Ningyu Zhang},
  year = {2026},
  abstract = {Large language models (LLMs) excel at complex reasoning, yet their efficiency is limited by the surging cognitive overhead of long thought traces. In this paper, we propose LightThinker, a method that enables LLMs to dynamically compress intermediate thoughts into compact semantic representations. However, static compression often struggles with complex reasoning where the irreversible loss of intermediate details can lead to logical bottlenecks. To address this, we evolve the framework into Lig},
  url = {https://huggingface.co/papers/2604.03679},
  keywords = {large language models, intermediate thoughts, semantic representations, explicit adaptive memory management, memory primitives, trajectory synthesis pipeline, memory scheduling, long-horizon agentic tasks, token usage, inference time, huggingface daily},
  eprint = {2604.03679},
  archiveprefix = {arXiv},
}

Metadata

{}