Paper Detail

CLaaS: Continual learning as a service for sample efficient online learning

Kion Fallah, Silen Naihin, Barak Widawsky, Qingqing Mao

arXiv Score 25.0

Published 2026-06-04 · First seen 2026-06-05

Research Track A · General AI

Abstract

Deployed large language model agents must adapt to distribution shift in dynamic environments. Ideally, adaptation can be performed from accumulated agent experiences and retain prior capabilities while transferring to future tasks. However, agent actions and environmental transitions can only be sampled once per scenario, as real-world environments cannot be trivially reset. To this end, we investigate an experiential and online continual learning setting in which agents learn from a stream of scenarios. We propose continual learning as-a-service (CLaaS), a system which enables agents to improve during deployment, abstracted behind a chat API. To increase sample efficiency, CLaaS stores rollouts in an experience replay buffer for gradient reuse during asynchronous training. We evaluate CLaaS on an adversarial task, demonstrating that parametric updates lead to superior forward transfer and less forgetting than in-context learning, with replay being a critical choice for sample efficiency.
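
The abstract's replay idea can be illustrated with a minimal sketch. This is not the CLaaS implementation: the `Rollout` fields, the FIFO eviction policy, uniform sampling, and the `run_stream` helper are all assumptions made for illustration. It shows only how a rollout that is sampled once from the environment can still contribute to several training batches.

```python
from __future__ import annotations

import random
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Rollout:
    """One recorded scenario. Field names are hypothetical, not from the paper."""
    scenario_id: int
    prompt: str
    response: str
    reward: float


class ReplayBuffer:
    """Fixed-capacity FIFO buffer with uniform sampling (an assumed policy)."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self._items = deque(maxlen=capacity)  # oldest rollouts are evicted first
        self._rng = random.Random(seed)

    def add(self, rollout: Rollout) -> None:
        self._items.append(rollout)

    def sample(self, batch_size: int) -> list[Rollout]:
        # Sample without replacement, capped at the number of stored rollouts.
        k = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), k)

    def __len__(self) -> int:
        return len(self._items)


def run_stream(num_scenarios: int, capacity: int, batch_size: int,
               updates_per_scenario: int) -> dict[int, int]:
    """Simulate a scenario stream and count how often each rollout is reused."""
    buffer = ReplayBuffer(capacity)
    reuse_counts: dict[int, int] = {}
    for scenario_id in range(num_scenarios):
        # Each scenario is observed exactly once, as in the online setting.
        buffer.add(Rollout(scenario_id, f"prompt-{scenario_id}", "response", 1.0))
        for _ in range(updates_per_scenario):
            # A real trainer would compute a gradient step on this batch.
            for rollout in buffer.sample(batch_size):
                reuse_counts[rollout.scenario_id] = (
                    reuse_counts.get(rollout.scenario_id, 0) + 1
                )
    return reuse_counts
```

Calling `run_stream(5, 3, 2, 2)` returns a mapping from scenario id to the number of batches that included that rollout. In this toy setup the total number of uses exceeds the number of scenarios, which is the sample-efficiency argument in miniature.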

Workflow Status

Review status: pending
Role: unreviewed
Read priority: now
Vote: Not set.
Saved: no
Collections: Not filed yet.
Next action: Not filled yet.

BibTeX

@article{fallah2026claas,
  title = {CLaaS: Continual learning as a service for sample efficient online learning},
  author = {Kion Fallah and Silen Naihin and Barak Widawsky and Qingqing Mao},
  year = {2026},
  abstract = {Deployed large language model agents must adapt to distribution shift in dynamic environments. Ideally, adaptation can be performed from accumulated agent experiences and retain prior capabilities while transferring to future tasks. However, agent actions and environmental transitions can only be sampled once per scenario, as real-world environments cannot be trivially reset. To this end, we investigate an experiential and online continual learning setting in which agents learn from a stream of },
  url = {https://arxiv.org/abs/2606.05559},
  keywords = {cs.LG},
  eprint = {2606.05559},
  archiveprefix = {arXiv},
}
