Localizing Search in Reinforcement Learning
Abstract
Reinforcement learning (RL) can be impractical for many high dimensional problems because of the computational cost of doing stochastic search in large state spaces. We propose a new RL method, Boundary Localized Reinforcement Learning (BLRL), which maps RL into a mode switching problem where an agent deterministically chooses an action based on its state, and limits stochastic search to small areas around mode boundaries, drastically reducing computational cost. BLRL starts with an initial set of parameterized boundaries that partition the state space into distinct control modes. Reinforcement reward is used to update the boundary parameters using the policy gradient formulation of Sutton et al. (2000). We demonstrate that stochastic search can be limited to regions near mode boundaries, thus greatly reducing search, while still guaranteeing convergence to a locally optimal deterministic mode switching policy. Further, we give conditions under which the policy gradient can be arbitrarily well approximated without the use of any stochastic search. These theoretical results are supported experimentally via simulation.
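The core idea of the abstract can be sketched in a toy 1-D setting: one parameter partitions the state space into two control modes, the agent acts deterministically everywhere except in a narrow band around the boundary, and reward inside that band nudges the boundary parameter. All names here (`THETA`, `EPS`, `ALPHA`, `GOAL`, the reward function, and the simple update rule) are invented for illustration; the paper's actual update uses the policy gradient formulation of Sutton et al. (2000).

```python
import random

# Toy 1-D illustration of boundary-localized search (not the paper's code).
# A single boundary parameter theta splits [0, 1] into two control modes;
# stochastic exploration is confined to a band of half-width EPS around it.

EPS = 0.1      # half-width of the stochastic search band
ALPHA = 0.05   # boundary learning rate
GOAL = 0.6     # hidden optimal boundary location in this toy problem

def action(state, theta):
    """Deterministic mode choice outside the band, random search inside it."""
    if abs(state - theta) >= EPS:
        return 0 if state < theta else 1
    return random.randint(0, 1)

def reward(state, a):
    """+1 when the chosen mode matches the hidden optimal policy."""
    return 1.0 if a == (0 if state < GOAL else 1) else 0.0

def train(theta=0.2, episodes=5000, seed=0):
    """Move the boundary toward GOAL using reward gathered only in the band."""
    random.seed(seed)
    for _ in range(episodes):
        s = random.random()
        a = action(s, theta)
        if abs(s - theta) >= EPS:
            continue                 # outside the band: no search, no update
        # The reward reveals which mode was correct at s; nudge the boundary
        # so the deterministic policy agrees with it (a stand-in for the
        # policy-gradient update used in the paper).
        d = a if reward(s, a) == 1.0 else 1 - a
        if d == 1 and theta > s:
            theta -= ALPHA * (theta - s)
        elif d == 0 and theta <= s:
            theta += ALPHA * (s - theta + 1e-3)
    return theta
```

Because updates only occur inside the band, the vast majority of states are handled with zero exploration, which is the computational saving the abstract claims; in this toy run the boundary drifts from its initial value toward the optimal switching point.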
Cite
Text
Grudic and Ungar. "Localizing Search in Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2000.
Markdown
[Grudic and Ungar. "Localizing Search in Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2000.](https://mlanthology.org/aaai/2000/grudic2000aaai-localizing/)
BibTeX
@inproceedings{grudic2000aaai-localizing,
title = {{Localizing Search in Reinforcement Learning}},
author = {Grudic, Gregory Z. and Ungar, Lyle H.},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2000},
pages = {590--595},
url = {https://mlanthology.org/aaai/2000/grudic2000aaai-localizing/}
}