Provably Efficient Third-Person Imitation from Offline Observation

Abstract

Domain adaptation in imitation learning represents an essential step towards improving generalizability. However, even in the restricted setting of third-person imitation where transfer is between isomorphic Markov Decision Processes, there are no strong guarantees on the performance of transferred policies. We present problem-dependent, statistical learning guarantees for third-person imitation from observation in an offline setting, and a lower bound on performance in the online setting.

Cite

Text

Zweig and Bruna. "Provably Efficient Third-Person Imitation from Offline Observation." Uncertainty in Artificial Intelligence, 2020.

Markdown

[Zweig and Bruna. "Provably Efficient Third-Person Imitation from Offline Observation." Uncertainty in Artificial Intelligence, 2020.](https://mlanthology.org/uai/2020/zweig2020uai-provably/)

BibTeX

@inproceedings{zweig2020uai-provably,
  title     = {{Provably Efficient Third-Person Imitation from Offline Observation}},
  author    = {Zweig, Aaron and Bruna, Joan},
  booktitle = {Uncertainty in Artificial Intelligence},
  year      = {2020},
  pages     = {1228--1237},
  volume    = {124},
  url       = {https://mlanthology.org/uai/2020/zweig2020uai-provably/}
}