$\pi$2vec: Policy Representation with Successor Features

Abstract

This paper introduces $\pi$2vec, a method for representing black-box policies as comparable feature vectors. Our method combines the strengths of foundation models, which serve as generic and powerful state representations, with successor features, which model the future occurrence of states under a policy. $\pi$2vec represents the behavior of a policy by capturing the statistics of features from a pretrained model within the successor feature framework. We focus on the offline setting, where both policies and their representations are trained on a fixed dataset of trajectories. Finally, we employ linear regression on $\pi$2vec vector representations to predict the performance of held-out policies. The synergy of these techniques results in a method for efficient policy evaluation in resource-constrained environments.
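
As a rough illustration of the two ingredients named in the abstract, the sketch below builds a policy vector as a discounted sum of state features and then fits a linear regressor from such vectors to policy performance. It is a simplification under stated assumptions, not the paper's procedure: the successor features here are plain Monte Carlo discounted sums over feature sequences rather than quantities learned offline from a fixed dataset, the state features are assumed to be precomputed by some pretrained encoder, and all function names (`discounted_feature_sum`, `policy_embedding`, `fit_linear_predictor`, `predict`) and the discount value are hypothetical.

```python
import numpy as np


def discounted_feature_sum(features, gamma):
    """Monte Carlo successor-feature estimate for one trajectory.

    `features` is a (T, d) sequence of state features phi(s_t), assumed to
    come from a pretrained encoder. Returns sum_t gamma^t * phi(s_t).
    """
    features = np.asarray(features, dtype=float)
    discounts = gamma ** np.arange(len(features))
    return discounts @ features


def policy_embedding(trajectories, gamma=0.9):
    """Average the per-trajectory estimates into one vector per policy."""
    sums = [discounted_feature_sum(t, gamma) for t in trajectories]
    return np.mean(sums, axis=0)


def fit_linear_predictor(embeddings, performances):
    """Least-squares fit of performance on policy embeddings, with a bias term."""
    X = np.asarray(embeddings, dtype=float)
    X = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    y = np.asarray(performances, dtype=float)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def predict(weights, embedding):
    """Predict the performance of a held-out policy from its embedding."""
    x = np.append(np.asarray(embedding, dtype=float), 1.0)
    return float(x @ weights)
```

In this sketch, the regressor would be fit on embeddings of policies whose performance is known and then applied to the embedding of a held-out policy; how the embeddings are actually trained in the offline setting is left to the paper.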

Cite

Text

Scarpellini et al. "$\pi$2vec: Policy Representation with Successor Features." International Conference on Learning Representations, 2024.

Markdown

[Scarpellini et al. "$\pi$2vec: Policy Representation with Successor Features." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/scarpellini2024iclr-2vec/)

BibTeX

@inproceedings{scarpellini2024iclr-2vec,
  title     = {{$\pi$2vec: Policy Representation with Successor Features}},
  author    = {Scarpellini, Gianluca and Konyushkova, Ksenia and Fantacci, Claudio and Paine, Thomas and Chen, Yutian and Denil, Misha},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/scarpellini2024iclr-2vec/}
}