Learning Multiple Models for Reward Maximization

Abstract

We present an approach to reward maximization in a non-stationary mobile robot environment. The approach works within the realistic constraints of limited local sensing and limited a priori knowledge of the environment. It is based on the use of augmented Markov models (AMMs), a general modeling tool we have developed. AMMs are essentially Markov chains having additional statistics associated with states and state transitions. We have developed an algorithm that constructs AMMs on-line and in real time with little computational and space overhead, making it practical to learn multiple models of the interaction dynamics between a robot and its environment during the execution of a task. For the purposes of reward maximization in a non-stationary environment, these models monitor events at increasing intervals of time and provide statistics used to discard redundant or outdated information while reducing the probability of conforming to noise. We have successfully i...
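
As a rough illustration of the idea of a Markov chain with additional statistics on states and transitions, the following minimal Python sketch builds such a model incrementally from a stream of observed states. It is not the authors' algorithm: the class name `AugmentedMarkovModel`, the method names, and the choice of statistics (visit counts, time spent per state, transition counts) are assumptions made for illustration only.

```python
from collections import defaultdict


class AugmentedMarkovModel:
    """Hypothetical sketch: a Markov chain that also keeps simple
    per-state and per-transition statistics, updated one observation
    at a time. The statistics chosen here are illustrative assumptions."""

    def __init__(self):
        self.visits = defaultdict(int)        # state -> number of visits
        self.total_time = defaultdict(float)  # state -> total time spent there
        # source state -> destination state -> observed transition count
        self.transitions = defaultdict(lambda: defaultdict(int))
        self.current = None                   # most recently observed state

    def observe(self, state, duration=1.0):
        """Incorporate one observed state; constant work per update."""
        if self.current is not None:
            self.transitions[self.current][state] += 1
        self.visits[state] += 1
        self.total_time[state] += duration
        self.current = state

    def transition_probability(self, src, dst):
        """Estimate P(dst | src) from the transition counts seen so far."""
        outgoing = self.transitions.get(src)
        if not outgoing:
            return 0.0
        return outgoing.get(dst, 0) / sum(outgoing.values())

    def mean_duration(self, state):
        """Average time spent per visit to `state` (0.0 if never visited)."""
        n = self.visits.get(state, 0)
        return self.total_time.get(state, 0.0) / n if n else 0.0


# Example: feed a short, made-up sequence of (state, duration) events.
amm = AugmentedMarkovModel()
for state, duration in [("search", 2.0), ("grab", 1.0),
                        ("search", 4.0), ("home", 1.0)]:
    amm.observe(state, duration)
```

Because each update touches only a few counters, a structure like this is cheap enough that several instances could be maintained side by side, for example one per time scale, which is the kind of use the abstract describes.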

Cite

Text

Goldberg and Mataric. "Learning Multiple Models for Reward Maximization." International Conference on Machine Learning, 2000.

Markdown

[Goldberg and Mataric. "Learning Multiple Models for Reward Maximization." International Conference on Machine Learning, 2000.](https://mlanthology.org/icml/2000/goldberg2000icml-learning/)

BibTeX

@inproceedings{goldberg2000icml-learning,
  title     = {{Learning Multiple Models for Reward Maximization}},
  author    = {Goldberg, Dani and Mataric, Maja J.},
  booktitle = {International Conference on Machine Learning},
  year      = {2000},
  pages     = {319--326},
  url       = {https://mlanthology.org/icml/2000/goldberg2000icml-learning/}
}