Exploring Unknown Environments with Real-Time Search or Reinforcement Learning
Abstract
Learning Real-Time A* (LRTA*) is a popular control method that interleaves planning and plan execution and has been shown to solve search problems in known environments efficiently. In this paper, we apply LRTA* to the problem of getting to a given goal location in an initially unknown environment. Uninformed LRTA* with maximal lookahead always moves on a shortest path to the closest unvisited state, that is, to the closest potential goal state. This was believed to be a good exploration heuristic, but we show that it does not minimize the worst-case plan-execution time compared to other uninformed exploration methods. This result is also of interest to reinforcement-learning researchers since many reinforcement-learning methods use asynchronous dynamic programming, interleave planning and plan execution, and exhibit optimism in the face of uncertainty, just like LRTA*.
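The exploration strategy the abstract describes — repeatedly moving along a shortest path through the already-explored part of the environment to the closest unvisited state — can be sketched in a few lines. This is an illustrative simplification under assumed interfaces (a `neighbors_fn` sensing callback and a hashable state type), not the paper's exact algorithm or analysis:

```python
from collections import deque

def explore_nearest_unvisited(start, neighbors_fn):
    """Greedy exploration sketch: repeatedly walk a shortest known path
    to the closest unvisited state until none remain reachable.
    `neighbors_fn(s)` reveals the neighbors of s (the sensing step).
    Returns the sequence of states traversed."""
    visited = {start}
    known = {start: list(neighbors_fn(start))}  # adjacency discovered so far
    tour = [start]
    current = start
    while True:
        # Breadth-first search over the known graph for the
        # closest unvisited state.
        parent = {current: None}
        queue = deque([current])
        target = None
        while queue:
            s = queue.popleft()
            if s not in visited:
                target = s
                break
            for t in known.get(s, []):
                if t not in parent:
                    parent[t] = s
                    queue.append(t)
        if target is None:  # every reachable state has been visited
            return tour
        # Reconstruct the shortest path and walk it.
        path = []
        s = target
        while s is not None:
            path.append(s)
            s = parent[s]
        path.reverse()  # path[0] == current, path[-1] == target
        for s in path[1:]:
            tour.append(s)
            visited.add(s)
            if s not in known:
                known[s] = list(neighbors_fn(s))
        current = target
```

Note that this sketch captures only the uninformed, maximal-lookahead movement rule; the paper's contribution is the worst-case analysis of the tour such a rule produces, which the code does not reproduce.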
Cite

Text

Koenig. "Exploring Unknown Environments with Real-Time Search or Reinforcement Learning." Neural Information Processing Systems, 1998.

BibTeX
@inproceedings{koenig1998neurips-exploring,
title = {{Exploring Unknown Environments with Real-Time Search or Reinforcement Learning}},
author = {Koenig, Sven},
booktitle = {Neural Information Processing Systems},
year = {1998},
pages = {1003-1009},
url = {https://mlanthology.org/neurips/1998/koenig1998neurips-exploring/}
}