Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning

Abstract

While goal-conditioned behavioral cloning (GCBC) methods can perform well on in-distribution training tasks, they do not necessarily generalize zero-shot to tasks that require conditioning on novel state-goal pairs, i.e., combinatorial generalization. In part, this limitation can be attributed to a lack of temporal consistency in the state representation learned by BC: if temporally correlated states are encoded to similar latent representations, then the out-of-distribution gap for novel state-goal pairs is reduced. We formalize this notion by demonstrating how encouraging long-range temporal consistency via successor representations (SR) can facilitate generalization. We then propose a simple yet effective representation learning objective for GCBC, $\text{BYOL-}\gamma$, which theoretically approximates the successor representation in the finite MDP case through self-predictive representations and achieves competitive empirical performance across a suite of challenging tasks requiring combinatorial generalization.
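
As a rough illustration of the successor representation mentioned in the abstract, the sketch below computes the standard tabular SR, $M = \sum_{t \ge 0} \gamma^t P^t$, for a toy Markov chain and compares SR rows of nearby and distant states. It is not the paper's $\text{BYOL-}\gamma$ objective or experimental setup: the 6-state random walk, the discount `GAMMA = 0.9`, and all function names are assumptions chosen only to show how temporally close states can end up with similar SR vectors.

```python
# Illustrative sketch only: a toy 6-state random walk, not the paper's setup.
GAMMA = 0.9  # assumed discount factor
N = 6        # assumed number of states


def transition_matrix(n):
    """Row-stochastic matrix for a random walk on a line with reflecting ends."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        left = max(i - 1, 0)
        right = min(i + 1, n - 1)
        P[i][left] += 0.5
        P[i][right] += 0.5
    return P


def successor_representation(P, gamma, iters=500):
    """Iterate M <- I + gamma * P @ M; the fixed point is sum_t gamma^t P^t."""
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(iters):
        M = [
            [
                identity[i][j] + gamma * sum(P[i][k] * M[k][j] for k in range(n))
                for j in range(n)
            ]
            for i in range(n)
        ]
    return M


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)


P = transition_matrix(N)
M = successor_representation(P, GAMMA)

# Compare the SR row of state 0 with an adjacent state and with the far end.
sim_near = cosine(M[0], M[1])
sim_far = cosine(M[0], M[N - 1])
print(f"cosine(SR[0], SR[1]) = {sim_near:.3f}")
print(f"cosine(SR[0], SR[{N - 1}]) = {sim_far:.3f}")
```

In this toy chain the SR row of state 0 is more similar to that of its neighbor than to that of the far end of the line, which is the kind of temporal consistency the abstract argues should narrow the gap for unseen state-goal pairs.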

Cite

Text

Lawson et al. "Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning." International Conference on Learning Representations, 2026.

Markdown

[Lawson et al. "Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/lawson2026iclr-selfpredictive/)

BibTeX

@inproceedings{lawson2026iclr-selfpredictive,
  title     = {{Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning}},
  author    = {Lawson, Daniel and Hugessen, Adriana and Cloutier, Charlotte and Berseth, Glen and Khetarpal, Khimya},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/lawson2026iclr-selfpredictive/}
}