Paper Detail

Latent Preference Modeling for Cross-Session Personalized Tool Calling

Yejin Yoon, Minseo Kim, Taeuk Kim

huggingface Score 14.5

Published 2026-04-20 · First seen 2026-04-21

General AI

Abstract

Users often omit essential details in their requests to LLM-based agents, resulting in under-specified inputs for tool use. This poses a fundamental challenge for tool-augmented agents, as API execution typically requires complete arguments, highlighting the need for personalized tool calling. To study this problem, we introduce MPT, a benchmark comprising 265 multi-session dialogues that cover three challenges: Preference Recall, Preference Induction, and Preference Transfer. We also propose PRefine, a test-time memory-augmented method that represents user preferences as evolving hypotheses. Through a generate-verify-refine loop, it extracts reusable constraints from history and improves tool-calling accuracy while using only 1.24% of the tokens required by full-history prompting. These results indicate that robust personalization in agentic systems depends on memory that captures the reasons behind user choices, not just the choices themselves.
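
The abstract does not say how PRefine represents, verifies, or refines its hypotheses, so the following is only a toy sketch of the general idea of keeping preferences as evolving hypotheses and using them to complete an under-specified tool call. Everything in it is an assumption made for illustration: the function names `update_memory` and `complete_call`, the slot/value representation, the support counter, and the `min_support` threshold are hypothetical and are not taken from the paper, which presumably relies on an LLM rather than on counting.

```python
def update_memory(memory, session):
    """Run one generate/verify/refine pass over a single past session.

    `memory` maps a tool-argument name (slot) to a (value, support) pair.
    `session` maps slots to the values the user actually chose.
    This counting scheme is a stand-in, not the method from the paper.
    """
    updated = dict(memory)
    for slot, value in session.items():
        if slot not in updated:
            # Generate: first observation becomes a new hypothesis.
            updated[slot] = (value, 1)
            continue
        held, support = updated[slot]
        if held == value:
            # Verify: the session agrees with the hypothesis, so strengthen it.
            updated[slot] = (held, support + 1)
        elif support <= 1:
            # Refine: a weak hypothesis is contradicted, so replace it.
            updated[slot] = (value, 1)
        else:
            # Refine: a stronger hypothesis is contradicted, so weaken it.
            updated[slot] = (held, support - 1)
    return updated


def complete_call(partial_call, memory, min_support=2):
    """Fill omitted arguments from well-supported hypotheses.

    Arguments the user stated explicitly always override stored preferences.
    """
    trusted = {
        slot: value
        for slot, (value, support) in memory.items()
        if support >= min_support
    }
    return {**trusted, **partial_call}


# Hypothetical booking history across four sessions.
sessions = [
    {"seat": "aisle", "airline": "KE"},
    {"seat": "aisle"},
    {"seat": "window"},
    {"seat": "aisle", "airline": "OZ"},
]

memory = {}
for session in sessions:
    memory = update_memory(memory, session)

# The new request names only a destination; the seat is filled from memory.
print(complete_call({"destination": "ICN"}, memory))
```

Only the compact `memory` dictionary is consulted when completing a call, not the full history, which is the property the abstract's token-usage comparison points at. Note that this sketch stores only the choices; the abstract's closing claim is that the reasons behind them matter as well, which a slot/value counter cannot capture.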

BibTeX

@misc{yoon2026latent,
  title = {Latent Preference Modeling for Cross-Session Personalized Tool Calling},
  author = {Yejin Yoon and Minseo Kim and Taeuk Kim},
  year = {2026},
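  abstract = {Users often omit essential details in their requests to LLM-based agents, resulting in under-specified inputs for tool use. This poses a fundamental challenge for tool-augmented agents, as API execution typically requires complete arguments, highlighting the need for personalized tool calling. To study this problem, we introduce MPT, a benchmark comprising 265 multi-session dialogues that cover three challenges: Preference Recall, Preference Induction, and Preference Transfer. We also propose PRefine, a test-time memory-augmented method that represents user preferences as evolving hypotheses. Through a generate--verify--refine loop, it extracts reusable constraints from history and improves tool-calling accuracy while using only 1.24\% of the tokens required by full-history prompting. These results indicate that robust personalization in agentic systems depends on memory that captures the reasons behind user choices, not just the choices themselves.},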
  url = {https://huggingface.co/papers/2604.17886},
  keywords = {tool-augmented agents, API execution, personalized tool calling, MPT benchmark, PRefine, test-time memory augmentation, generate--verify--refine loop, user preferences, multi-session dialogues, preference recall, preference induction, preference transfer, huggingface daily},
  eprint = {2604.17886},
  archiveprefix = {arXiv},
}