Learning First-Order Markov Models for Control
Abstract
First-order Markov models have been successfully applied to many problems, for example in modeling sequential data using Markov chains, and modeling control problems using the Markov decision processes (MDP) formalism. If a first-order Markov model's parameters are estimated from data, the standard maximum likelihood estimator considers only the first-order (single-step) transitions. But for many problems, the first-order conditional independence assumptions are not satisfied, and as a result the higher order transition probabilities may be poorly approximated. Motivated by the problem of learning an MDP's parameters for control, we propose an algorithm for learning a first-order Markov model that explicitly takes into account higher order interactions during training. Our algorithm uses an optimization criterion different from maximum likelihood, and allows us to learn models that capture longer range effects, but without giving up the benefits of using first-order Markov models. Our experimental results also show the new algorithm outperforming conventional maximum likelihood estimation in a number of control problems where the MDP's parameters are estimated from data.
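The abstract's central observation can be illustrated concretely: the maximum likelihood estimate of a first-order Markov chain is just the normalized single-step transition counts, and when the data actually comes from a higher-order process, the fitted model's multi-step predictions can diverge from the empirical multi-step frequencies. The sketch below (an illustration of the phenomenon, not the paper's proposed algorithm; the simulated second-order process is our own assumption) fits a first-order chain by maximum likelihood and compares its two-step prediction, the matrix square of the transition matrix, against the empirical two-step frequencies.

```python
import numpy as np

def fit_first_order(seq, n_states):
    # Maximum likelihood estimate of a first-order Markov chain:
    # normalized single-step transition counts.
    counts = np.zeros((n_states, n_states))
    for a, b in zip(seq[:-1], seq[1:]):
        counts[a, b] += 1
    return counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)

def empirical_k_step(seq, n_states, k):
    # Empirical k-step transition frequencies, estimated directly.
    counts = np.zeros((n_states, n_states))
    for a, b in zip(seq[:-k], seq[k:]):
        counts[a, b] += 1
    return counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)

# Simulate a SECOND-order process: x_t copies x_{t-2} with prob 0.9,
# otherwise it is uniform random. This violates the first-order
# conditional independence assumption.
rng = np.random.default_rng(0)
seq = [0, 1]
for _ in range(5000):
    seq.append(seq[-2] if rng.random() < 0.9 else int(rng.integers(2)))

T = fit_first_order(seq, 2)
# The first-order model predicts two-step transitions via T @ T,
# which here differs noticeably from the empirical two-step frequencies.
gap = np.abs(T @ T - empirical_k_step(seq, 2, 2)).max()
print(gap)
```

With this data the gap is on the order of a few percent: the maximum likelihood fit reproduces single-step statistics well but systematically mispredicts two-step behavior, which is exactly the failure mode the paper's alternative training criterion targets.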
Cite
Text
Abbeel and Ng. "Learning First-Order Markov Models for Control." Neural Information Processing Systems, 2004.
Markdown
[Abbeel and Ng. "Learning First-Order Markov Models for Control." Neural Information Processing Systems, 2004.](https://mlanthology.org/neurips/2004/abbeel2004neurips-learning/)
BibTeX
@inproceedings{abbeel2004neurips-learning,
title = {{Learning First-Order Markov Models for Control}},
author = {Abbeel, Pieter and Ng, Andrew Y.},
booktitle = {Neural Information Processing Systems},
year = {2004},
pages = {1-8},
url = {https://mlanthology.org/neurips/2004/abbeel2004neurips-learning/}
}