Paper Detail

From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench

Ke Xu, Yuhao Wang, Yu Wang

arXiv · Score 19.3

Published 2026-04-16 · First seen 2026-04-17

General AI

Abstract

Recent advancements in LLM agents are gradually shifting from reactive, text-based paradigms toward proactive, multimodal interaction. However, existing benchmarks primarily focus on reactive responses, overlooking the complexities of proactive intervention and monitoring. To bridge this gap, we introduce ProVoice-Bench, the first evaluation framework specifically designed for proactive voice agents, featuring four novel tasks. By leveraging a multi-stage data synthesis pipeline, we curate 1,182 high-quality samples for rigorous testing. Our evaluation of state-of-the-art Multimodal LLMs reveals a significant performance gap, particularly regarding over-triggering and reasoning capabilities. These findings highlight the limitations of current models and offer a roadmap for developing more natural, context-aware proactive agents.

Workflow Status

Review status: pending
Role: unreviewed
Read priority: now
Vote: Not set.
Saved: no
Collections: Not filed yet.
Next action: Not filled yet.

Reading Brief

No structured notes yet. Add `summary_sections`, `why_relevant`, `claim_impact`, or `next_action` in `papers.jsonl` to enrich this view.
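As a rough illustration of the enrichment step described above, the following sketch merges note fields into one record of a JSON Lines file. It is not the tool's own code: the helper name `enrich_record`, the use of `eprint` as the lookup key, and the value types of the four fields are all assumptions, since the schema of `papers.jsonl` is not shown on this page.

```python
import json
from pathlib import Path


def enrich_record(path, match_key, match_value, notes):
    """Merge `notes` into the first record whose `match_key` equals `match_value`.

    Hypothetical helper: assumes one JSON object per line and that the
    file is small enough to rewrite in full. Returns True if a record
    was updated, False if no record matched.
    """
    path = Path(path)
    updated = False
    out = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines rather than failing on them
        record = json.loads(line)
        if not updated and record.get(match_key) == match_value:
            record.update(notes)  # add or overwrite the note fields
            updated = True
        out.append(json.dumps(record, ensure_ascii=False))
    path.write_text("\n".join(out) + "\n", encoding="utf-8")
    return updated


# Example call (field values are placeholders, and their types are assumed):
# enrich_record(
#     "papers.jsonl",
#     "eprint",
#     "2604.15037",
#     {
#         "summary_sections": ["..."],
#         "why_relevant": "...",
#         "claim_impact": "...",
#         "next_action": "...",
#     },
# )
```

If the file uses a different identifier field, pass that field name as `match_key` instead; the helper itself makes no assumption about which keys exist beyond the one it is asked to match.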

Why It Surfaced

No ranking explanation is available yet.

Tags

No tags.

BibTeX

@article{xu2026reactive,
  title = {From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench},
  author = {Ke Xu and Yuhao Wang and Yu Wang},
  year = {2026},
  abstract = {Recent advancements in LLM agents are gradually shifting from reactive, text-based paradigms toward proactive, multimodal interaction. However, existing benchmarks primarily focus on reactive responses, overlooking the complexities of proactive intervention and monitoring. To bridge this gap, we introduce ProVoice-Bench, the first evaluation framework specifically designed for proactive voice agents, featuring four novel tasks. By leveraging a multi-stage data synthesis pipeline, we curate 1,182},
  url = {https://arxiv.org/abs/2604.15037},
  keywords = {cs.AI, cs.CL, cs.SD},
  eprint = {2604.15037},
  archiveprefix = {arXiv},
}
