Paper Detail

Therefore I am. I Think

Esakkivel Esakkiraja, Sai Rajeswar, Denis Akhiyarov, Rajagopal Venkatesaramani

huggingface Score 8.0

Published 2026-04-02 · First seen 2026-04-04

General AI

Abstract

We consider the question: when a large language reasoning model makes a choice, did it think first and then decide to, or decide first and then think? In this paper, we present evidence that detectable, early-encoded decisions shape chain-of-thought in reasoning models. Specifically, we show that a simple linear probe successfully decodes tool-calling decisions from pre-generation activations with very high confidence, and in some cases, even before a single reasoning token is produced. Activation steering supports this causally: perturbing the decision direction leads to inflated deliberation, and flips behavior in many examples (between 7% and 79%, depending on model and benchmark). We also show through behavioral analysis that, when steering changes the decision, the chain-of-thought process often rationalizes the flip rather than resisting it. Together, these results suggest that reasoning models can encode action choices before they begin to deliberate in text.
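
The two techniques named in the abstract, a linear probe over activations and steering along the probed direction, can be illustrated with a toy sketch. This is not the authors' code or setup: the "activations" below are synthetic Gaussian vectors with a planted decision direction, the probe is a plain logistic regression trained by gradient descent, and every function name (`make_synthetic_activations`, `fit_linear_probe`, `predict`, `steer`) is hypothetical. It only shows the mechanics of decoding a binary decision linearly and then shifting vectors along the learned direction to flip it.

```python
import numpy as np


def make_synthetic_activations(n=800, dim=32, seed=0):
    """Toy stand-in for pre-generation activations (an assumption, not real model data).

    Each vector is unit Gaussian noise plus a shift of +2 or -2 along one
    hidden "decision direction", depending on the binary label
    (1 = call the tool, 0 = do not).
    """
    rng = np.random.default_rng(seed)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    labels = rng.integers(0, 2, size=n)
    shift = 2.0 * (2 * labels - 1)  # +2 for label 1, -2 for label 0
    acts = rng.normal(size=(n, dim)) + np.outer(shift, direction)
    return acts, labels


def fit_linear_probe(acts, labels, lr=0.1, steps=500):
    """Fit a logistic-regression probe with batch gradient descent."""
    w = np.zeros(acts.shape[1])
    b = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(acts @ w + b)))
        err = probs - labels
        w -= lr * (acts.T @ err) / len(labels)
        b -= lr * err.mean()
    return w, b


def predict(acts, w, b):
    """Return the probe's hard decision (0 or 1) for each activation vector."""
    return (acts @ w + b > 0).astype(int)


def steer(acts, w, alpha):
    """Shift every vector by alpha along the unit-normalized probe direction."""
    unit = w / np.linalg.norm(w)
    return acts + alpha * unit


if __name__ == "__main__":
    acts, labels = make_synthetic_activations()
    train_x, train_y = acts[:600], labels[:600]
    test_x, test_y = acts[600:], labels[600:]

    w, b = fit_linear_probe(train_x, train_y)
    accuracy = (predict(test_x, w, b) == test_y).mean()
    print(f"held-out probe accuracy: {accuracy:.3f}")

    # Take held-out vectors the probe reads as "call the tool" and push them
    # against the probe direction, then check how many decisions flip.
    called = test_x[predict(test_x, w, b) == 1]
    flip_rate = (predict(steer(called, w, alpha=-8.0), w, b) == 0).mean()
    print(f"flip rate after steering: {flip_rate:.3f}")
```

In this toy setting the flip is measured on the probe's own readout, so a high flip rate is expected by construction. The paper's claim is stronger, since it concerns how the model's generated behavior and chain-of-thought change under steering, which this sketch does not model.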

BibTeX

@misc{esakkiraja2026therefore,
  title = {Therefore I am. I Think},
  author = {Esakkivel Esakkiraja and Sai Rajeswar and Denis Akhiyarov and Rajagopal Venkatesaramani},
  year = {2026},
  abstract = {We consider the question: when a large language reasoning model makes a choice, did it think first and then decide to, or decide first and then think? In this paper, we present evidence that detectable, early-encoded decisions shape chain-of-thought in reasoning models. Specifically, we show that a simple linear probe successfully decodes tool-calling decisions from pre-generation activations with very high confidence, and in some cases, even before a single reasoning token is produced. Activati},
  url = {https://huggingface.co/papers/2604.01202},
  keywords = {chain-of-thought, linear probe, activation steering, tool-calling decisions, pre-generation activations, deliberation, behavioral analysis, huggingface daily},
  eprint = {2604.01202},
  archiveprefix = {arXiv},
}
