State Chrono Representation for Enhancing Generalization in Reinforcement Learning

Abstract

In reinforcement learning with image-based inputs, it is crucial to establish a robust and generalizable state representation. Recent advancements in metric learning, such as deep bisimulation metric approaches, have shown promising results in learning structured low-dimensional representation space from pixel observations, where the distance between states is measured based on task-relevant features. However, these approaches face challenges in demanding generalization tasks and scenarios with non-informative rewards. This is because they fail to capture sufficient long-term information in the learned representations. To address these challenges, we propose a novel State Chrono Representation (SCR) approach. SCR augments state metric-based representations by incorporating extensive temporal information into the update step of bisimulation metric learning. It learns state distances within a temporal framework that considers both future dynamics and cumulative rewards over current and long-term future states. Our learning strategy effectively incorporates future behavioral information into the representation space without introducing a significant number of additional parameters for modeling dynamics. Extensive experiments conducted in DeepMind Control and Meta-World environments demonstrate that SCR achieves better performance comparing to other recent metric-based methods in demanding generalization tasks. The codes of SCR are available in https://github.com/jianda-chen/SCR.

Cite

Text

Chen et al. "State Chrono Representation for Enhancing Generalization in Reinforcement Learning." Neural Information Processing Systems, 2024. doi:10.52202/079017-2332

Markdown

[Chen et al. "State Chrono Representation for Enhancing Generalization in Reinforcement Learning." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/chen2024neurips-state/) doi:10.52202/079017-2332

BibTeX

@inproceedings{chen2024neurips-state,
  title     = {{State Chrono Representation for Enhancing Generalization in Reinforcement Learning}},
  author    = {Chen, Jianda and Ng, Wen Zheng Terence and Chen, Zichen and Pan, Sinno Jialin and Zhang, Tianwei},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-2332},
  url       = {https://mlanthology.org/neurips/2024/chen2024neurips-state/}
}