Paper Detail

Generalization in LLM Problem Solving: The Case of the Shortest Path

Yao Tong, Jiayuan Ye, Anastasia Borovykh, Reza Shokri

arXiv Score 8.3

Published 2026-04-16 · First seen 2026-04-17

General AI

Abstract

Whether language models can systematically generalize remains actively debated. Yet empirical performance is jointly shaped by multiple factors such as training data, training paradigms, and inference-time strategies, making failures difficult to interpret. We introduce a controlled synthetic environment based on shortest-path planning, a canonical composable sequential optimization problem. The setup enables clean separation of these factors and supports two orthogonal axes of generalization: spatial transfer to unseen maps and length scaling to longer-horizon problems. We find that models exhibit strong spatial transfer but consistently fail under length scaling due to recursive instability. We further analyze how distinct stages of the learning pipeline influence systematic problem-solving: for example, data coverage sets capability limits; reinforcement learning improves training stability but does not expand those limits; and inference-time scaling enhances performance but cannot rescue length-scaling failures.
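
To make the two generalization axes concrete, here is a minimal Python sketch of how such a benchmark could be split. It is an illustration only: the abstract does not specify the paper's map format, solver, or thresholds, so the grid representation, the BFS solver, and the names `make_grid`, `shortest_path`, `split_by_axis`, and `max_train_len` are all hypothetical.

```python
import random
from collections import deque


def make_grid(size, wall_prob, rng):
    """Hypothetical map format: a square grid where True marks a wall."""
    return [[rng.random() < wall_prob for _ in range(size)] for _ in range(size)]


def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected grid.

    Returns the list of cells from start to goal, or None if unreachable.
    """
    if grid[start[0]][start[1]] or grid[goal[0]][goal[1]]:
        return None
    n = len(grid)
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < n and 0 <= nc < n and not grid[nr][nc] and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None


def split_by_axis(num_maps=20, size=6, wall_prob=0.2, max_train_len=5, seed=0):
    """Assumed split: short paths on seen maps train the model,
    short paths on unseen maps test spatial transfer, and
    long paths on seen maps test length scaling."""
    rng = random.Random(seed)
    train, spatial_test, length_test = [], [], []
    for map_id in range(num_maps):
        grid = make_grid(size, wall_prob, rng)
        seen = map_id < num_maps // 2
        for _ in range(30):
            start = (rng.randrange(size), rng.randrange(size))
            goal = (rng.randrange(size), rng.randrange(size))
            path = shortest_path(grid, start, goal)
            if path is None or start == goal:
                continue
            record = {"map_id": map_id, "start": start, "goal": goal, "path": path}
            hops = len(path) - 1
            if seen and hops <= max_train_len:
                train.append(record)
            elif not seen and hops <= max_train_len:
                spatial_test.append(record)
            elif seen:
                length_test.append(record)
            # Long paths on unseen maps mix both axes and are dropped here.
    return train, spatial_test, length_test


if __name__ == "__main__":
    train, spatial_test, length_test = split_by_axis()
    print(len(train), len(spatial_test), len(length_test))
```

Keeping the long-horizon test on already-seen maps is a design choice of this sketch: it varies only path length, so a failure there cannot be attributed to an unfamiliar map.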

Workflow Status

Review status: pending
Role: unreviewed
Read priority: soon
Vote: Not set.
Saved: no
Collections: Not filed yet.
Next action: Not filled yet.

Reading Brief

No structured notes yet. Add `summary_sections`, `why_relevant`, `claim_impact`, or `next_action` in `papers.jsonl` to enrich this view.

Why It Surfaced

No ranking explanation is available yet.

BibTeX

@article{tong2026generalization,
  title = {Generalization in LLM Problem Solving: The Case of the Shortest Path},
  author = {Yao Tong and Jiayuan Ye and Anastasia Borovykh and Reza Shokri},
  year = {2026},
  abstract = {Whether language models can systematically generalize remains actively debated. Yet empirical performance is jointly shaped by multiple factors such as training data, training paradigms, and inference-time strategies, making failures difficult to interpret. We introduce a controlled synthetic environment based on shortest-path planning, a canonical composable sequential optimization problem. The setup enables clean separation of these factors and supports two orthogonal axes of generalization: s},
  url = {https://arxiv.org/abs/2604.15306},
  keywords = {cs.AI, cs.LG, Generalization, Scaling, Computer science, Pipeline (software), Stability (learning theory)},
  eprint = {2604.15306},
  archiveprefix = {arXiv},
}

Metadata

{}