Lagrange Dual Decomposition for Finite Horizon Markov Decision Processes

Abstract

Solving finite-horizon Markov Decision Processes with stationary policies is a computationally difficult problem. Our dynamic dual decomposition approach uses Lagrange duality to decouple this hard problem into a sequence of tractable sub-problems. The resulting procedure is a straightforward modification of standard non-stationary Markov Decision Process solvers and gives an upper bound on the total expected reward. The empirical performance of the method suggests not only that it is a rapidly convergent algorithm, but also that it performs favourably compared to standard planning algorithms such as policy gradients and lower-bound procedures such as Expectation Maximisation.
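As a rough illustration of the general idea behind Lagrange dual decomposition (a toy problem, not the paper's MDP algorithm), the sketch below dualizes a coupling constraint so that a joint optimization splits into independent sub-problems, with the multiplier updated by a subgradient step on the dual. All function names and the example objective are illustrative assumptions.

```python
# Toy Lagrange dual decomposition (illustration only, not the paper's method).
# Problem: maximize f(x) + g(z) = -(x - 3)^2 - (z - 1)^2  subject to x = z.
# Dualizing the coupling constraint x = z with multiplier lam decouples the
# problem into two independent one-dimensional maximizations.

def solve_x(lam):
    # argmax over x of  -(x - 3)^2 + lam * x   (closed form)
    return 3.0 + lam / 2.0

def solve_z(lam):
    # argmax over z of  -(z - 1)^2 - lam * z   (closed form)
    return 1.0 - lam / 2.0

lam, step = 0.0, 0.5
for _ in range(200):
    x, z = solve_x(lam), solve_z(lam)
    lam -= step * (x - z)  # subgradient step on the dual variable

# At convergence the constraint holds (x = z = 2), recovering the primal
# optimum; for any lam, the dual value upper-bounds the primal optimum,
# mirroring the upper-bound property described in the abstract.
```

Each sub-problem here is trivial by design; in the paper's setting the analogous sub-problems are tractable non-stationary MDP solves.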

Cite

Text

Furmston and Barber. "Lagrange Dual Decomposition for Finite Horizon Markov Decision Processes." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2011. doi:10.1007/978-3-642-23780-5_41

Markdown

[Furmston and Barber. "Lagrange Dual Decomposition for Finite Horizon Markov Decision Processes." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2011.](https://mlanthology.org/ecmlpkdd/2011/furmston2011ecmlpkdd-lagrange/) doi:10.1007/978-3-642-23780-5_41

BibTeX

@inproceedings{furmston2011ecmlpkdd-lagrange,
  title     = {{Lagrange Dual Decomposition for Finite Horizon Markov Decision Processes}},
  author    = {Furmston, Thomas and Barber, David},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2011},
  pages     = {487--502},
  doi       = {10.1007/978-3-642-23780-5_41},
  url       = {https://mlanthology.org/ecmlpkdd/2011/furmston2011ecmlpkdd-lagrange/}
}