Research Paper Cockpit

Research Track A

Papers on learning and model adaptation methods.

Research Workflow

Latest digest: 2026-03-28.

Papers

19 visible entries

arxiv Score 31.5

Simple Recipe Works: Vision-Language-Action Models are Natural Continual Learners with Reinforcement Learning

2026-03-12 · Jiaheng Hu, Jay Shim, Chen Tang, Yoonchang Sung, Bo Liu, Peter Stone, Roberto Martin-Martin

Research Track A · General AI

Continual Reinforcement Learning (CRL) for Vision-Language-Action (VLA) models is a promising direction toward self-improving embodied agents that can adapt in open-ended, evolving environments. However, conventional wisdom from continual learning suggests that naive Sequential Fine-Tuning (Seq. FT) leads to catastrophi…

Review: pending · Role: unreviewed · Read: now

arxiv Score 25.9

Universe Routing: Why Self-Evolving Agents Need Epistemic Control

2026-03-16 · Zhaohui Geoffrey Wang

Research Track A · General AI

A critical failure mode of current lifelong agents is not lack of knowledge, but the inability to decide how to reason. When an agent encounters "Is this coin fair?" it must recognize whether to invoke frequentist hypothesis testing or Bayesian posterior inference - frameworks that are epistemologically incompatible. M…

Review: pending · Role: unreviewed · Read: now

arxiv Score 24.0

Continual Learning in Large Language Models: Methods, Challenges, and Opportunities

2026-03-13 · Hongyang Chen, Zhongwu Sun, Hongfei Ye, Kunchi Li, Xuemin Lin

Research Track A · General AI

Continual learning (CL) has emerged as a pivotal paradigm to enable large language models (LLMs) to dynamically adapt to evolving knowledge and sequential tasks while mitigating catastrophic forgetting, a critical limitation of the static pre-training paradigm inherent to modern LLMs. This survey presents a comprehensiv…

Review: pending · Role: unreviewed · Read: now

arxiv Score 21.6

ElephantBroker: A Knowledge-Grounded Cognitive Runtime for Trustworthy AI Agents

2026-03-26 · Cristian Lupascu, Alexandru Lupascu

Research Track A · General AI

Large Language Model-based agents increasingly operate in high-stakes, multi-turn settings where factual grounding is critical, yet their memory systems typically rely on flat key-value stores or plain vector retrieval with no mechanism to track the provenance or trustworthiness of stored knowledge. We present Elephant…

Review: pending · Role: unreviewed · Read: now

arxiv Score 19.4

All-day Multi-scenes Lifelong Vision-and-Language Navigation with Tucker Adaptation

2026-03-15 · Xudong Wang, Gan Li, Zhiyu Liu, Yao Wang, Lianqing Liu, Zhi Han

Research Track A · General AI

Deploying vision-and-language navigation (VLN) agents requires adaptation across diverse scenes and environments, but fine-tuning on a specific scenario often causes catastrophic forgetting in others, which severely limits flexible long-term deployment. We formalize this challenge as the all-day multi-scenes lifelong V…

Review: pending · Role: unreviewed · Read: now

arxiv Score 16.5

Pruned Adaptation Modules: A Simple yet Strong Baseline for Continual Foundation Models

2026-03-22 · Elif Ceren Gok Yildirim, Murat Onur Yildirim, Joaquin Vanschoren

Research Track A · General AI

The continual learning literature has rapidly shifted from traditional class incremental learning (CIL) techniques to foundation model (FM)-based CIL methods without a clear understanding of how these newer approaches compare to strong, lightweight convolutional baselines. This abrupt transition has created a substanti…

Review: pending · Role: unreviewed · Read: now

arxiv Score 16.4

Deconfounded Lifelong Learning for Autonomous Driving via Dynamic Knowledge Spaces

2026-03-15 · Jiayuan Du, Yuebing Song, Yiming Zhao, Xianghui Pan, Jiawei Lian, Yuchu Lu, Liuyi Wang, Chengju Liu, Qijun Chen

Research Track A · General AI

End-to-End autonomous driving (E2E-AD) systems face challenges in lifelong learning, including catastrophic forgetting, difficulty in knowledge transfer across diverse scenarios, and spurious correlations between unobservable confounders and true driving intents. To address these issues, we propose DeLL, a Deconfounded…

Review: pending · Role: unreviewed · Read: now

arxiv Score 16.0

Lifelong Embodied Navigation Learning

2026-03-06 · Xudong Wang, Jiahua Dong, Baichen Liu, Qi Lyu, Lianqing Liu, Zhi Han

Research Track A · General AI

Embodied navigation agents powered by large language models have shown strong performance on individual tasks but struggle to continually acquire new navigation skills, which suffer from catastrophic forgetting. We formalize this challenge as lifelong embodied navigation learning (LENL), where an agent is required to a…

Review: pending · Role: unreviewed · Read: now

arxiv Score 15.0

Reframing Long-Tailed Learning via Loss Landscape Geometry

2026-03-22 · Shenghan Chen, Yiming Liu, Yanzhen Wang, Yujia Wang, Xiankai Lu

Research Track A · General AI

Balancing the performance trade-off on long-tail (LT) data distributions remains a long-standing challenge. In this paper, we posit that this dilemma stems from a phenomenon called "tail performance degradation" (the model tends to severely overfit on head classes while quickly forgetting tail classes) and pose a solution …

Review: pending · Role: unreviewed · Read: soon

arxiv Score 14.6

SOMA: Strategic Orchestration and Memory-Augmented System for Vision-Language-Action Model Robustness via In-Context Adaptation

2026-03-25 · Zhuoran Li, Zhiyang Li, Kaijun Zhou, Jinyu Gu

Research Track A · General AI

Despite the promise of Vision-Language-Action (VLA) models as generalist robotic controllers, their robustness against perceptual noise and environmental variations in out-of-distribution (OOD) tasks remains fundamentally limited by the absence of long-term memory, causal failure attribution, and dynamic intervention c…

Review: pending · Role: unreviewed · Read: now

arxiv Score 13.8

Evidence of an Emergent "Self" in Continual Robot Learning

2026-03-25 · Adidev Jhunjhunwala, Judah Goldfeder, Hod Lipson

Research Track A

A key challenge to understanding self-awareness has been a principled way of quantifying whether an intelligent system has a concept of a "self," and if so how to differentiate the "self" from other cognitive structures. We propose that the "self" can be isolated by seeking the invariant portion of cognitive process th…

Review: pending · Role: unreviewed · Read: soon

arxiv Score 13.5

Bi-CRCL: Bidirectional Conservative-Radical Complementary Learning with Pre-trained Foundation Models for Class-incremental Medical Image Analysis

2026-03-24 · Xinyao Wu, Zhe Xu, Cheng Chen, Jiawei Ma, Yefeng Zheng, Raymond Kai-yu Tong

Research Track A · General AI

Class-incremental learning (CIL) in medical image-guided diagnosis requires retaining prior diagnostic knowledge while adapting to newly emerging disease categories, which is critical for scalable clinical deployment. This problem is particularly challenging due to heterogeneous data and privacy constraints that preven…

Review: pending · Role: unreviewed · Read: soon

arxiv Score 12.8

DIET: Learning to Distill Dataset Continually for Recommender Systems

2026-03-26 · Jiaqing Zhang, Hao Wang, Mingjia Yin, Bo Chen, Qinglin Jia, Rui Zhou, Ruiming Tang, ChaoYi Ma, Enhong Chen

Research Track A · General AI

Modern deep recommender models are trained under a continual learning paradigm, relying on massive and continuously growing streaming behavioral logs. In large-scale platforms, retraining models on full historical data for architecture comparison or iteration is prohibitively expensive, severely slowing down model deve…

Review: pending · Role: unreviewed · Read: soon

arxiv Score 12.5

STEM Agent: A Self-Adapting, Tool-Enabled, Extensible Architecture for Multi-Protocol AI Agent Systems

2026-03-22 · Alfred Shen, Aaron Shen

Research Track A · General AI

Current AI agent frameworks commit early to a single interaction protocol, a fixed tool integration strategy, and static user models, limiting their deployment across diverse interaction paradigms. To address these constraints, we introduce STEM Agent (Self-adapting, Tool-enabled, Extensible, Multi-agent), a modular ar…

Review: pending · Role: unreviewed · Read: soon

arxiv Score 11.6

Beyond Benchmarks: How Users Evaluate AI Chat Assistants

2026-03-26 · Moiz Sadiq Awan, Muhammad Haris Noor, Muhammad Salman Munaf

Research Track A · General AI

Automated benchmarks dominate the evaluation of large language models, yet no systematic study has compared user satisfaction, adoption motivations, and frustrations across competing platforms using a consistent instrument. We address this gap with a cross-platform survey of 388 active AI chat users, comparing satisfac…

Review: pending · Role: unreviewed · Read: now

arxiv Score 11.5

Similarity-Aware Mixture-of-Experts for Data-Efficient Continual Learning

2026-03-24 · Connor Mclaughlin, Nigel Lee, Lili Su

Research Track A

Machine learning models often need to adapt to new data after deployment due to structured or unstructured real-world dynamics. The Continual Learning (CL) framework enables continuous model adaptation, but most existing approaches either assume each task contains sufficiently many data samples or that the learning tas…

Review: pending · Role: unreviewed · Read: soon

arxiv Score 11.0

RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots

2026-03-04 · Soroush Nasiriany, Sepehr Nasiriany, Abhiram Maddukuri, Yuke Zhu

Research Track A · General AI

Recent advances in robot learning have accelerated progress toward generalist robots that can perform everyday tasks in human environments. Yet it remains difficult to gauge how close we are to this vision. The field lacks a reproducible, large-scale benchmark for systematic evaluation. To fill this gap, we present Rob…

Review: pending · Role: unreviewed · Read: soon

arxiv Score 10.8

Cognitive Dark Matter: Measuring What AI Misses

2026-03-03 · Patrick J. Mineault, Thomas L. Griffiths, Sean Escola

Research Track A · General AI

We propose that the jagged intelligence landscape of modern AI systems arises from a missing training signal that we call "cognitive dark matter" (CDM): brain functions that meaningfully shape behavior yet are hard to infer from behavior alone. We identify key CDM domains: metacognition, cognitive flexibility, episodic …

Review: pending · Role: unreviewed · Read: soon

arxiv Score 9.8

GridVAD: Open-Set Video Anomaly Detection via Spatial Reasoning over Stratified Frame Grids

2026-03-26 · Mohamed Eltahir, Ahmed O. Ibrahim, Obada Siralkhatim, Tabarak Abdallah, Sondos Mohamed

Research Track A · General AI

Vision-Language Models (VLMs) are powerful open-set reasoners, yet their direct use as anomaly detectors in video surveillance is fragile: without calibrated anomaly priors, they alternate between missed detections and hallucinated false alarms. We argue the problem is not the VLM itself but how it is used. VLMs should…

Review: pending · Role: unreviewed · Read: soon