Paper Detail

Debiased Model-based Representations for Sample-efficient Continuous Control

Jiafei Lyu, Zichuan Lin, Scott Fujimoto, Kai Yang, Yangkun Chen, Saiyong Yang, Zongqing Lu, Deheng Ye

huggingface Score 9.5

Published 2026-05-12 · First seen 2026-05-13

General AI

Abstract

Model-based representations recently stand out as a promising framework that embeds latent dynamics information into the representations for downstream off-policy actor-critic learning. It implicitly combines the advantages of both model-free and model-based approaches while avoiding the training costs associated with model-based methods. Nevertheless, existing model-based representation methods can fail to capture sufficient information about relevant variables and can overfit to early experiences in the replay buffer. These incur biases in representation and actor-critic learning, leading to inferior performance. To address this, we propose Debiased model-based Representations for Q-learning, tagged DR.Q algorithm. DR.Q explicitly maximizes the mutual information between the representations of the current state-action pair and the next state besides minimizing their deviations, and samples transitions with faded prioritized experience replay. We evaluate DR.Q on numerous continuous control benchmarks with a single set of hyperparameters, and the results demonstrate that DR.Q can match or surpass recent strong baselines, sometimes outperforming them by a large margin. Our code is available at https://github.com/dmksjfl/DR.Q.

Workflow Status

Review status
pending
Role
unreviewed
Read priority
soon
Vote
Not set.
Saved
no
Collections
Not filed yet.
Next action
Not filled yet.

Reading Brief

No structured notes yet. Add `summary_sections`, `why_relevant`, `claim_impact`, or `next_action` in `papers.jsonl` to enrich this view.

Why It Surfaced

No ranking explanation is available yet.

Tags

No tags.

BibTeX

@misc{lyu2026debiased,
  title = {Debiased Model-based Representations for Sample-efficient Continuous Control},
  author = {Jiafei Lyu and Zichuan Lin and Scott Fujimoto and Kai Yang and Yangkun Chen and Saiyong Yang and Zongqing Lu and Deheng Ye},
  year = {2026},
  abstract = {Model-based representations recently stand out as a promising framework that embeds latent dynamics information into the representations for downstream off-policy actor-critic learning. It implicitly combines the advantages of both model-free and model-based approaches while avoiding the training costs associated with model-based methods. Nevertheless, existing model-based representation methods can fail to capture sufficient information about relevant variables and can overfit to early experien},
  url = {https://huggingface.co/papers/2605.11711},
  keywords = {model-based representations, latent dynamics information, off-policy actor-critic learning, model-free approaches, model-based approaches, replay buffer, mutual information, faded prioritized experience replay, Q-learning, representation learning, code available, huggingface daily},
  eprint = {2605.11711},
  archiveprefix = {arXiv},
}

Metadata

{}