Autonomous Exploration for Navigating in MDPs
Abstract
While intrinsically motivated learning agents hold considerable promise to overcome the limitations of more supervised learning systems, quantitative evaluation and theoretical analysis of such agents are difficult. We propose to consider a restricted setting for autonomous learning where systematic evaluation of learning performance is possible. In this setting, the agent needs to learn to navigate in a Markov Decision Process (MDP) where extrinsic rewards are not present or are ignored. We present a learning algorithm for this scenario and evaluate it by the amount of exploration it uses to learn the environment.
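To make the evaluation setting concrete, below is a minimal illustrative sketch, not the paper's algorithm: an agent acts in a small, randomly generated MDP without using any reward signal, and we record which states it discovers within a fixed horizon together with the number of exploration steps it spends. The function `random_exploration` and its parameters (`n_states`, `n_actions`, `horizon`, `n_episodes`) are assumptions introduced only for illustration.

```python
import random

def random_exploration(n_states=10, n_actions=2, horizon=8,
                       n_episodes=200, seed=0):
    """Toy illustration of the setting (not the paper's algorithm):
    reward-free random exploration of a small randomly generated MDP,
    evaluated by the number of exploration steps spent while discovering
    states reachable from the start state within `horizon` steps."""
    rng = random.Random(seed)
    # Random deterministic transition table: next_state[s][a] -> s'
    next_state = [[rng.randrange(n_states) for _ in range(n_actions)]
                  for _ in range(n_states)]

    start = 0
    reached = {start}   # states discovered within the horizon
    steps_used = 0      # the exploration cost being evaluated

    for _ in range(n_episodes):
        s = start       # every episode resets to the start state
        for _ in range(horizon):
            a = rng.randrange(n_actions)  # purely random, reward-free policy
            s = next_state[s][a]
            steps_used += 1
            reached.add(s)

    return reached, steps_used

if __name__ == "__main__":
    reached, steps = random_exploration()
    print(f"Discovered {len(reached)} states using {steps} exploration steps")
```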
Cite
Text
Lim and Auer. "Autonomous Exploration for Navigating in MDPs." Proceedings of the 25th Annual Conference on Learning Theory, 2012.
Markdown
[Lim and Auer. "Autonomous Exploration for Navigating in MDPs." Proceedings of the 25th Annual Conference on Learning Theory, 2012.](https://mlanthology.org/colt/2012/lim2012colt-autonomous/)
BibTeX
@inproceedings{lim2012colt-autonomous,
title = {{Autonomous Exploration for Navigating in MDPs}},
author = {Lim, Shiau Hong and Auer, Peter},
booktitle = {Proceedings of the 25th Annual Conference on Learning Theory},
year = {2012},
pages = {40.1-40.24},
volume = {23},
url = {https://mlanthology.org/colt/2012/lim2012colt-autonomous/}
}