Nudging Hidden States: Training-Free Model Steering for Chain-of-Thought Reasoning in Large Audio-Language Models

Lok-Lam Ieong, Chia-Chien Chen, Chih-Kai Yang, Yu-Han Huang, An-Yu Cheng, Hung-yi Lee

huggingface Score 10.0

Published 2026-03-15 · First seen 2026-03-27

General AI

Abstract

Chain-of-thought (CoT) prompting has been extended to large audio-language models (LALMs) to elicit reasoning, yet enhancing its effectiveness without training remains challenging. We study inference-time model steering as a training-free approach to improve LALM reasoning. We introduce three strategies using diverse information sources and evaluate them across four LALMs and four benchmarks. Results show general accuracy gains up to 4.4% over CoT prompting. Notably, we identify a cross-modal transfer where steering vectors derived from few text samples effectively guide speech-based reasoning, demonstrating high data efficiency. We also examine hyperparameter sensitivity to understand the robustness of these approaches. Our findings position model steering as a practical direction for strengthening LALM reasoning.
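The abstract does not say how the steering vectors are built or at which layer they are applied. As a rough illustration only, the sketch below assumes one common formulation of activation steering: take the difference between the mean hidden state of two sample sets, then add a scaled copy of that difference to a hidden state at inference time. The function names, the toy two-dimensional vectors, and the scale `alpha` are all hypothetical and are not taken from the paper; `alpha` merely stands in for the kind of hyperparameter whose sensitivity the abstract mentions.

```python
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def steering_vector(positive: List[Vector], negative: List[Vector]) -> Vector:
    """Difference of means (an assumed recipe, not confirmed by the abstract)."""
    mu_pos = mean_vector(positive)
    mu_neg = mean_vector(negative)
    return [p - q for p, q in zip(mu_pos, mu_neg)]


def steer(hidden: Vector, vector: Vector, alpha: float) -> Vector:
    """Nudge one hidden state along the steering direction, scaled by alpha."""
    return [h + alpha * v for h, v in zip(hidden, vector)]


# Toy stand-ins for hidden states collected from a few text samples.
# Real hidden states would come from a model layer; these are made up.
reasoning_states = [[1.0, 2.0], [3.0, 4.0]]
baseline_states = [[0.0, 1.0], [2.0, 1.0]]

v = steering_vector(reasoning_states, baseline_states)

# Apply the same vector to a hidden state from a different input,
# which is the shape of the cross-modal reuse the abstract describes.
steered = steer([0.5, 0.5], v, alpha=2.0)
```

In a real model the addition would happen inside the forward pass, for example through a layer hook, and the choice of layer and scale would need tuning. None of those details are given in the abstract.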

BibTeX

@misc{ieong2026nudging,
  title = {Nudging Hidden States: Training-Free Model Steering for Chain-of-Thought Reasoning in Large Audio-Language Models},
  author = {Lok-Lam Ieong and Chia-Chien Chen and Chih-Kai Yang and Yu-Han Huang and An-Yu Cheng and Hung-yi Lee},
  year = {2026},
  abstract = {Chain-of-thought (CoT) prompting has been extended to large audio-language models (LALMs) to elicit reasoning, yet enhancing its effectiveness without training remains challenging. We study inference-time model steering as a training-free approach to improve LALM reasoning. We introduce three strategies using diverse information sources and evaluate them across four LALMs and four benchmarks. Results show general accuracy gains up to 4.4\% over CoT prompting. Notably, we identify a cross-modal tr},
  url = {https://huggingface.co/papers/2603.14636},
  keywords = {chain-of-thought prompting, large audio-language models, inference-time model steering, steering vectors, cross-modal transfer, few-shot learning, huggingface daily},
  eprint = {2603.14636},
  archiveprefix = {arXiv},
}