Adaptive Bases for Reinforcement Learning

Abstract

We consider the problem of reinforcement learning using function approximation, where the approximating basis can change dynamically while interacting with the environment. A motivation for such an approach is maximizing the fit of the value function to the problem at hand. Three errors are considered: the squared approximation error, the Bellman residual, and the projected Bellman residual. Algorithms under the actor-critic framework are presented and shown to converge. The advantage of such an adaptive basis is demonstrated in simulations.
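For orientation, the three error criteria named in the abstract can be sketched in standard linear-approximation notation; the symbols below (the basis matrix Phi(w), weights theta, Bellman operator T^pi, projection Pi_w) are an assumed notation for illustration, not necessarily the paper's own.

% Sketch of the three criteria, assuming a linear value approximation
% V(s) ~ phi(s; w)^T theta with adaptive basis parameters w.
\begin{align*}
  \text{Squared approximation error:} \quad
    & \mathcal{E}_{\mathrm{app}}(\theta, w) = \big\| V^{\pi} - \Phi(w)\theta \big\|_{d^{\pi}}^{2}, \\
  \text{Bellman residual:} \quad
    & \mathcal{E}_{\mathrm{BR}}(\theta, w) = \big\| T^{\pi}\Phi(w)\theta - \Phi(w)\theta \big\|_{d^{\pi}}^{2}, \\
  \text{Projected Bellman residual:} \quad
    & \mathcal{E}_{\mathrm{PBR}}(\theta, w) = \big\| \Pi_{w} T^{\pi}\Phi(w)\theta - \Phi(w)\theta \big\|_{d^{\pi}}^{2},
\end{align*}
% where Phi(w) stacks the basis functions phi(.; w), T^pi is the Bellman operator
% for policy pi, Pi_w projects onto the span of Phi(w), and d^pi is the
% stationary state distribution under pi.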

Cite

Text

Di Castro and Mannor. "Adaptive Bases for Reinforcement Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010. doi:10.1007/978-3-642-15880-3_26

Markdown

[Di Castro and Mannor. "Adaptive Bases for Reinforcement Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010.](https://mlanthology.org/ecmlpkdd/2010/castro2010ecmlpkdd-adaptive/) doi:10.1007/978-3-642-15880-3_26

BibTeX

@inproceedings{castro2010ecmlpkdd-adaptive,
  title     = {{Adaptive Bases for Reinforcement Learning}},
  author    = {Di Castro, Dotan and Mannor, Shie},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2010},
  pages     = {312--327},
  doi       = {10.1007/978-3-642-15880-3_26},
  url       = {https://mlanthology.org/ecmlpkdd/2010/castro2010ecmlpkdd-adaptive/}
}