Provably Efficient Third-Person Imitation from Offline Observation
Abstract
Domain adaptation in imitation learning represents an essential step towards improving generalizability. However, even in the restricted setting of third-person imitation where transfer is between isomorphic Markov Decision Processes, there are no strong guarantees on the performance of transferred policies. We present problem-dependent, statistical learning guarantees for third-person imitation from observation in an offline setting, and a lower bound on performance in the online setting.
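The abstract describes the setting rather than an algorithm, but the problem it refers to can be made concrete. Below is a minimal sketch, not the paper's method, assuming tabular MDPs related by an unknown state permutation: the learner observes only expert states offline and aligns empirical state-occupancy distributions to recover the correspondence. All variable names and the assignment-based alignment heuristic are illustrative assumptions.

import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n_states = 6
n_samples = 20000

# Hypothetical expert state-occupancy measure in the expert's own MDP.
expert_occupancy = rng.dirichlet(np.ones(n_states))

# The learner's MDP is an isomorphic copy: states are relabeled by an unknown
# permutation (learner state j corresponds to expert state true_perm[j]).
true_perm = rng.permutation(n_states)
learner_occupancy = expert_occupancy[true_perm]

# Offline observation: the learner only sees sampled expert states (no actions,
# no rewards) and forms an empirical occupancy estimate in expert coordinates.
expert_states = rng.choice(n_states, size=n_samples, p=expert_occupancy)
empirical_expert = np.bincount(expert_states, minlength=n_states) / n_samples

# Align the two occupancy vectors by solving a linear assignment problem with
# cost |p_i - q_j|; this recovers the correspondence whenever occupancy values
# are separated by more than the estimation error.
cost = np.abs(empirical_expert[:, None] - learner_occupancy[None, :])
row, col = linear_sum_assignment(cost)
recovered_perm = np.empty(n_states, dtype=int)
recovered_perm[col] = row  # learner state -> matched expert state

print("true permutation:     ", true_perm)
print("recovered permutation:", recovered_perm)

In this toy version the quality of the recovered correspondence degrades as occupancy values become close relative to the sampling error, which is the kind of problem-dependent quantity the offline guarantees in the paper are stated in terms of.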
Cite
Text
Zweig and Bruna. "Provably Efficient Third-Person Imitation from Offline Observation." Uncertainty in Artificial Intelligence, 2020.
Markdown
[Zweig and Bruna. "Provably Efficient Third-Person Imitation from Offline Observation." Uncertainty in Artificial Intelligence, 2020.](https://mlanthology.org/uai/2020/zweig2020uai-provably/)
BibTeX
@inproceedings{zweig2020uai-provably,
title = {{Provably Efficient Third-Person Imitation from Offline Observation}},
author = {Zweig, Aaron and Bruna, Joan},
booktitle = {Uncertainty in Artificial Intelligence},
year = {2020},
pages = {1228-1237},
volume = {124},
url = {https://mlanthology.org/uai/2020/zweig2020uai-provably/}
}