Paper Detail

NonZero: Interaction-Guided Exploration for Multi-Agent Monte Carlo Tree Search

Sizhe Tang, Zuyuan Zhang, Mahdi Imani, Tian Lan

arxiv Score 6.2

Published 2026-05-01 · First seen 2026-05-04

General AI

Abstract

Monte Carlo Tree Search (MCTS) scales poorly in cooperative multi-agent domains because expansion must consider an exponentially large set of joint actions, severely limiting exploration under realistic search budgets. We propose NonZero, which keeps multi-agent MCTS tractable by running surrogate-guided selection over a low-dimensional nonlinear representation using an interaction-guided proposal rule, instead of directly exploring the full joint-action space. Our exploration uses an interaction score: single-agent deviations are ranked by predicted gain, while two-agent deviations are scored by a mixed-difference measure that reveals coordination benefits even when no single agent can improve alone. We formalize candidate proposal as a bandit problem over local deviations and derive a proposal rule, NonZero, with a sublinear local-regret guarantee for reaching approximate graph-local optima without enumerating the joint-action space. Empirically, NonZero improves sample efficiency and final performance on MatGame, SMAC, and SMACv2 relative to strong model-based and model-free baselines under matched search budgets.

Workflow Status

Review status
pending
Role
unreviewed
Read priority
later
Vote
Not set.
Saved
no
Collections
Not filed yet.
Next action
Not filled yet.

Reading Brief

No structured notes yet. Add `summary_sections`, `why_relevant`, `claim_impact`, or `next_action` in `papers.jsonl` to enrich this view.

Why It Surfaced

No ranking explanation is available yet.

Tags

No tags.

BibTeX

@article{tang2026nonzero,
  title = {NonZero: Interaction-Guided Exploration for Multi-Agent Monte Carlo Tree Search},
  author = {Sizhe Tang and Zuyuan Zhang and Mahdi Imani and Tian Lan},
  year = {2026},
  abstract = {Monte Carlo Tree Search (MCTS) scales poorly in cooperative multi-agent domains because expansion must consider an exponentially large set of joint actions, severely limiting exploration under realistic search budgets. We propose NonZero, which keeps multi-agent MCTS tractable by running surrogate-guided selection over a low-dimensional nonlinear representation using an interaction-guided proposal rule, instead of directly exploring the full joint-action space. Our exploration uses an interactio},
  url = {https://arxiv.org/abs/2605.00751},
  keywords = {cs.LG},
  eprint = {2605.00751},
  archiveprefix = {arXiv},
}

Metadata

{}