Improving the Learning Efficiencies of Realtime Search

Abstract

The capability of learning is one of the salient features of realtime search algorithms such as LRTA*. The major impediment is, however, the instability of the solution quality during convergence: (1) these algorithms try to find all optimal solutions even after obtaining fairly good solutions, and (2) they tend to move towards unexplored areas, thus failing to balance exploration and exploitation. We propose and analyze two new realtime search algorithms to stabilize the convergence process. ε-search (weighted realtime search) allows suboptimal solutions with ε error to reduce the total amount of learning performed. δ-search (realtime search with upper bounds) utilizes the upper bounds of estimated costs, which become available after the problem is solved once. Guided by the upper bounds, δ-search can better control the tradeoff between exploration and exploitation.

Introduction

Existing search algorithms can be divided into two classes: offline search such as A* [Hart et...

Cite

Text

Ishida and Shimbo. "Improving the Learning Efficiencies of Realtime Search." AAAI Conference on Artificial Intelligence, 1996.

Markdown

[Ishida and Shimbo. "Improving the Learning Efficiencies of Realtime Search." AAAI Conference on Artificial Intelligence, 1996.](https://mlanthology.org/aaai/1996/ishida1996aaai-improving/)

BibTeX

@inproceedings{ishida1996aaai-improving,
  title     = {{Improving the Learning Efficiencies of Realtime Search}},
  author    = {Ishida, Toru and Shimbo, Masashi},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {1996},
  pages     = {305--310},
  url       = {https://mlanthology.org/aaai/1996/ishida1996aaai-improving/}
}