Paper Detail

LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling

Jian Yang, Shawn Guo, Wei Zhang, Tianyu Zheng, Yaxin Du, Haau-Sing Li, Jiajun Wu, Yue Song, Yan Xing, Qingsong Cai, Zelong Huang, Chuan Hao, Ran Tao, Xianglong Liu, Wayne Xin Zhao, Mingjie Tang, Weifeng Lv, Ming Zhou, Bryan Dai

huggingface Score 17.5

Published 2026-06-16 · First seen 2026-06-17

General AI

Abstract

Looped Transformers scale latent computation by repeatedly applying shared blocks, but sequential looping increases latency and KV-cache memory with the loop count. Parallel loop Transformers (PLT) alleviate this cost through cross-loop position offsets (CLP) and shared-KV gated sliding-window attention, making loop count a practical design choice. We therefore study PLT loop-count selection through a gain-cost view: an extra loop may refine representations, but CLP also introduces a positional mismatch at each loop boundary. We instantiate this study by training LoopCoder-v2, a family of 7B PLT coders with different loop counts, from scratch on 18T tokens, followed by matched instruction tuning and evaluation. Empirically, the two-loop variant delivers broad gains over the non-looped baseline across code generation, code reasoning, agentic software engineering, and tool-use benchmarks, improving SWE-bench Verified from 43.0 to 64.4 points and Multi-SWE from 14.0 to 31.0 points. In contrast, variants with three or more loops regress, revealing a strongly non-monotonic loop-count effect. Our diagnostics show that loop 2 provides the main productive refinement, while later loops yield diminishing, oscillatory updates and reduced representational diversity. Because the CLP-induced mismatch remains roughly fixed as refinement gains shrink, the offset cost increasingly dominates. This gain-cost trade-off explains PLT's saturation at two loops and provides diagnostics for loop-count selection.
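
The gain-cost argument in the abstract can be illustrated with a toy model. The sketch below is not the paper's method or data: it simply assumes that each extra loop's refinement gain shrinks geometrically while the per-loop offset cost stays fixed, and it ignores the oscillatory behaviour the abstract mentions. The function names and the parameters `first_gain`, `decay`, and `offset_cost` are hypothetical, chosen only for illustration.

```python
def net_benefit(loop_count, first_gain, decay, offset_cost):
    """Toy cumulative net benefit of `loop_count` loops relative to a single pass.

    Assumption (not from the paper): loop k >= 2 contributes a refinement gain of
    first_gain * decay**(k - 2) and pays a fixed positional-offset cost.
    """
    total = 0.0
    for k in range(2, loop_count + 1):
        marginal_gain = first_gain * decay ** (k - 2)
        total += marginal_gain - offset_cost
    return total


def best_loop_count(max_loops, first_gain, decay, offset_cost):
    """Return the loop count in 1..max_loops with the highest toy net benefit."""
    return max(
        range(1, max_loops + 1),
        key=lambda n: net_benefit(n, first_gain, decay, offset_cost),
    )


if __name__ == "__main__":
    # Hypothetical values: loop 2 gains more than the offset costs, later loops do not.
    for n in range(1, 5):
        print(n, net_benefit(n, first_gain=1.0, decay=0.3, offset_cost=0.5))
    print("best:", best_loop_count(6, first_gain=1.0, decay=0.3, offset_cost=0.5))
```

Under these made-up values the net benefit peaks at two loops and falls afterwards, which mirrors the qualitative shape described in the abstract. Different assumed values would move the peak, so the sketch shows the mechanism and nothing about the actual magnitudes.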

BibTeX

@misc{yang2026loopcoder,
  title = {LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling},
  author = {Jian Yang and Shawn Guo and Wei Zhang and Tianyu Zheng and Yaxin Du and Haau-Sing Li and Jiajun Wu and Yue Song and Yan Xing and Qingsong Cai and Zelong Huang and Chuan Hao and Ran Tao and Xianglong Liu and Wayne Xin Zhao and Mingjie Tang and Weifeng Lv and Ming Zhou and Bryan Dai},
  year = {2026},
  abstract = {Looped Transformers scale latent computation by repeatedly applying shared blocks, but sequential looping increases latency and KV-cache memory with the loop count. Parallel loop Transformers (PLT) alleviate this cost through cross-loop position offsets (CLP) and shared-KV gated sliding-window attention, making loop count a practical design choice. We therefore study PLT loop-count selection through a gain--cost view: an extra loop may refine representations, but CLP also introduces a positional},
  url = {https://huggingface.co/papers/2606.18023},
  keywords = {Looped Transformers, parallel loop Transformers, cross-loop position offsets, shared-KV gated sliding-window attention, loop-count selection, LoopCoder-v2, instruction tuning, SWE-bench, Multi-SWE, huggingface daily},
  eprint = {2606.18023},
  archiveprefix = {arXiv},
}
