Paper Detail

CLaaS: Continual learning as a service for sample efficient online learning

Kion Fallah, Silen Naihin, Barak Widawsky, Qingqing Mao

arXiv score: 25.0

Published 2026-06-04 · First seen 2026-06-05

Research Track A · General AI

Abstract

Deployed large language model agents must adapt to distribution shift in dynamic environments. Ideally, adaptation can be performed from accumulated agent experiences and retain prior capabilities while transferring to future tasks. However, agent actions and environmental transitions can only be sampled once per scenario, as real-world environments cannot be trivially reset. To this end, we investigate an experiential and online continual learning setting in which agents learn from a stream of scenarios. We propose continual learning as-a-service (CLaaS), a system which enables agents to improve during deployment, abstracted behind a chat API. To increase sample efficiency, CLaaS stores rollouts in an experience replay buffer for gradient reuse during asynchronous training. We evaluate CLaaS on an adversarial task, demonstrating that parametric updates lead to superior forward transfer and less forgetting than in-context learning, with replay being a critical choice for sample efficiency.
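
To make the replay idea in the abstract concrete, here is a minimal sketch of an experience replay buffer fed by a stream of scenarios that are each sampled exactly once. It is not the CLaaS implementation: the `Rollout` fields, FIFO eviction, uniform sampling, and the `run_stream` helper are all assumptions chosen for illustration, and asynchronous training is not modeled (updates run inline after each scenario).

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Rollout:
    """One agent experience. Field names are hypothetical, not from the paper."""
    scenario_id: int
    transcript: str
    reward: float


class ReplayBuffer:
    """Fixed-capacity buffer; the oldest rollout is evicted when full (assumed policy)."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, rollout: Rollout) -> None:
        self._items.append(rollout)

    def sample(self, batch_size: int) -> list:
        # Uniform sampling without replacement, capped at the current size.
        k = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), k)

    def __len__(self) -> int:
        return len(self._items)


def run_stream(num_scenarios: int, updates_per_scenario: int, batch_size: int) -> dict:
    """Count how often each rollout is reused across training batches."""
    buffer = ReplayBuffer(capacity=64)
    reuse_counts = {}
    for scenario_id in range(num_scenarios):
        # Each scenario yields a single rollout: the environment cannot be reset.
        buffer.add(Rollout(scenario_id, f"rollout-{scenario_id}", reward=0.0))
        for _ in range(updates_per_scenario):
            for rollout in buffer.sample(batch_size):
                # A real trainer would take a gradient step on this rollout here.
                reuse_counts[rollout.scenario_id] = reuse_counts.get(rollout.scenario_id, 0) + 1
    return reuse_counts
```

In this sketch a rollout collected early in the stream can appear in many later batches, which is the sense in which replay lets one sampled experience contribute to more than one update.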

BibTeX

@article{fallah2026claas,
  title = {CLaaS: Continual learning as a service for sample efficient online learning},
  author = {Kion Fallah and Silen Naihin and Barak Widawsky and Qingqing Mao},
  year = {2026},
  abstract = {Deployed large language model agents must adapt to distribution shift in dynamic environments. Ideally, adaptation can be performed from accumulated agent experiences and retain prior capabilities while transferring to future tasks. However, agent actions and environmental transitions can only be sampled once per scenario, as real-world environments cannot be trivially reset. To this end, we investigate an experiential and online continual learning setting in which agents learn from a stream of },
  url = {https://arxiv.org/abs/2606.05559},
  keywords = {cs.LG},
  eprint = {2606.05559},
  archiveprefix = {arXiv},
}
