Exploring Unknown Environments with Real-Time Search or Reinforcement Learning

Abstract

Learning Real-Time A* (LRTA*) is a popular control method that interleaves planning and plan execution and has been shown to solve search problems in known environments efficiently. In this paper, we apply LRTA* to the problem of getting to a given goal location in an initially unknown environment. Uninformed LRTA* with maximal lookahead always moves on a shortest path to the closest unvisited state, that is, to the closest potential goal state. This was believed to be a good exploration heuristic, but we show that it does not minimize the worst-case plan-execution time compared to other uninformed exploration methods. This result is also of interest to reinforcement-learning researchers since many reinforcement-learning methods use asynchronous dynamic programming, interleave planning and plan execution, and exhibit optimism in the face of uncertainty, just like LRTA*.
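
To make the exploration rule concrete, here is a minimal Python sketch (not the paper's code) of uninformed LRTA* with maximal lookahead, under an assumed graph interface: from the current state, search breadth-first over the part of the environment observed so far, move along a shortest known path to the closest unvisited state, and repeat until a goal is reached. The names explore, is_goal, and neighbors, as well as the grid example, are illustrative assumptions.

from collections import deque

def explore(start, is_goal, neighbors):
    """Walk an initially unknown environment until a goal state is reached.

    neighbors(s) reveals the successors of s only once s is visited,
    modeling the fact that the environment is not known in advance.
    """
    visited = {start}
    known = {start: list(neighbors(start))}   # adjacency discovered so far
    s = start
    trajectory = [s]
    while not is_goal(s):
        # Breadth-first search over the known graph for the closest
        # unvisited state, i.e. the closest potential goal state.
        parent = {s: None}
        queue = deque([s])
        target = None
        while queue:
            u = queue.popleft()
            if u not in visited:
                target = u          # first unvisited state popped is closest
                break
            for v in known[u]:
                if v not in parent:
                    parent[v] = u
                    queue.append(v)
        if target is None:
            raise RuntimeError("no unvisited state is reachable")
        # Reconstruct the shortest known path and execute it.
        path = []
        u = target
        while u != s:
            path.append(u)
            u = parent[u]
        for step in reversed(path):
            s = step
            trajectory.append(s)
            if s not in visited:
                visited.add(s)
                known[s] = list(neighbors(s))   # observe new successors
    return trajectory

# Hypothetical usage: a 3x3 grid with one blocked cell, goal at (2, 2).
blocked = {(1, 1)}
def grid_neighbors(cell):
    x, y = cell
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [c for c in candidates
            if c not in blocked and 0 <= c[0] < 3 and 0 <= c[1] < 3]

print(explore((0, 0), lambda c: c == (2, 2), grid_neighbors))

Treating every unvisited state as a potential goal and heading for the closest one is exactly the "optimism in the face of uncertainty" the abstract mentions; the paper's point is that this greedy rule, despite its appeal, does not minimize worst-case plan-execution time among uninformed exploration methods.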

Cite

Text

Koenig. "Exploring Unknown Environments with Real-Time Search or Reinforcement Learning." Neural Information Processing Systems, 1998.

Markdown

[Koenig. "Exploring Unknown Environments with Real-Time Search or Reinforcement Learning." Neural Information Processing Systems, 1998.](https://mlanthology.org/neurips/1998/koenig1998neurips-exploring/)

BibTeX

@inproceedings{koenig1998neurips-exploring,
  title     = {{Exploring Unknown Environments with Real-Time Search or Reinforcement Learning}},
  author    = {Koenig, Sven},
  booktitle = {Neural Information Processing Systems},
  year      = {1998},
  pages     = {1003--1009},
  url       = {https://mlanthology.org/neurips/1998/koenig1998neurips-exploring/}
}