Paper Detail

Reward-Based Online LLM Routing via NeuralUCB

Ming-Hua Tsai, Phat Tran

arxiv Score 7.8

Published 2026-03-31 · First seen 2026-04-01

General AI

Abstract

This study investigates the use of NeuralUCB for cost-aware large language model (LLM) routing. Existing routing approaches can be broadly grouped into supervised routing methods and partial-feedback methods, each with different tradeoffs in efficiency and adaptivity. We implement a NeuralUCB-based routing policy and evaluate it on RouterBench under a simulated online setting. Experimental results show that the proposed method consistently outperforms random and min-cost baselines in utility reward. Compared with the max-quality reference, our method achieves substantially lower inference cost while maintaining competitive reward. These findings suggest that NeuralUCB is a promising approach for cost-aware LLM routing, while also highlighting remaining challenges in action discrimination and exploration.

Workflow Status

Review status
pending
Role
unreviewed
Read priority
soon
Vote
Not set.
Saved
no
Collections
Not filed yet.
Next action
Not filled yet.

Reading Brief

No structured notes yet. Add `summary_sections`, `why_relevant`, `claim_impact`, or `next_action` in `papers.jsonl` to enrich this view.

Why It Surfaced

No ranking explanation is available yet.

Tags

No tags.

BibTeX

@article{tsai2026reward,
  title = {Reward-Based Online LLM Routing via NeuralUCB},
  author = {Ming-Hua Tsai and Phat Tran},
  year = {2026},
  abstract = {This study investigates the use of NeuralUCB for cost-aware large language model (LLM) routing. Existing routing approaches can be broadly grouped into supervised routing methods and partial-feedback methods, each with different tradeoffs in efficiency and adaptivity. We implement a NeuralUCB-based routing policy and evaluate it on RouterBench under a simulated online setting. Experimental results show that the proposed method consistently outperforms random and min-cost baselines in utility rew},
  url = {https://arxiv.org/abs/2603.30035},
  keywords = {cs.LG, cs.CL},
  eprint = {2603.30035},
  archiveprefix = {arXiv},
}

Metadata

{}