Structure Learning in Ergodic Factored MDPs Without Knowledge of the Transition Function's In-Degree

Abstract

This paper introduces Learn Structure and Exploit RMax (LSE-RMax), a novel model-based structure learning algorithm for ergodic factored-state MDPs. Given a planning horizon that satisfies a condition, LSE-RMax provably guarantees a return very close to the optimal return, with high certainty, without requiring any prior knowledge of the in-degree of the transition function as input. LSE-RMax is fully implemented, and we provide a thorough analysis of its sample complexity. We also present empirical results demonstrating its effectiveness compared to prior approaches to the problem.

Cite

Text

Chakraborty and Stone. "Structure Learning in Ergodic Factored MDPs Without Knowledge of the Transition Function's In-Degree." International Conference on Machine Learning, 2011.

Markdown

[Chakraborty and Stone. "Structure Learning in Ergodic Factored MDPs Without Knowledge of the Transition Function's In-Degree." International Conference on Machine Learning, 2011.](https://mlanthology.org/icml/2011/chakraborty2011icml-structure/)

BibTeX

@inproceedings{chakraborty2011icml-structure,
  title     = {{Structure Learning in Ergodic Factored MDPs Without Knowledge of the Transition Function's In-Degree}},
  author    = {Chakraborty, Doran and Stone, Peter},
  booktitle = {International Conference on Machine Learning},
  year      = {2011},
  pages     = {737--744},
  url       = {https://mlanthology.org/icml/2011/chakraborty2011icml-structure/}
}