P3VI: A Partitioned, Prioritized, Parallel Value Iterator

Abstract

We examine the state of the art in using value iteration to solve large-scale discrete Markov Decision Processes. We introduce an architecture that combines three independent performance enhancements (intelligent prioritization of computation, state partitioning, and massively parallel processing) into a single algorithm. We show that each idea improves performance in a different way, so algorithm designers do not have to trade one improvement for another. We give special attention to parallelization issues, discussing how to efficiently partition states, distribute partitions to processors, minimize message passing, and ensure high scalability. We present experimental results demonstrating that this approach solves large problems in reasonable time.
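As a rough illustration of how partitioning and prioritization can interact, the following is a minimal single-process sketch, not the authors' implementation. It assumes a hypothetical MDP encoding (`transitions[s][a]` as a list of `(next_state, probability, reward)` tuples), a fixed partitioning supplied by the caller, and the largest Bellman residual in a partition as that partition's priority. The parallel distribution of partitions to processors discussed in the abstract is omitted entirely, and all function names are illustrative.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: transitions[s][a] = [(next_state, probability, reward), ...]
# A state with no actions is treated as terminal with value 0.
Transitions = Dict[int, Dict[int, List[Tuple[int, float, float]]]]


def backup(s: int, V: Dict[int, float], transitions: Transitions, gamma: float) -> float:
    """Bellman optimality backup for a single state."""
    actions = transitions[s]
    if not actions:
        return 0.0
    return max(
        sum(p * (r + gamma * V[s2]) for s2, p, r in outcomes)
        for outcomes in actions.values()
    )


def residual(part: List[int], V: Dict[int, float],
             transitions: Transitions, gamma: float) -> float:
    """Assumed priority metric: largest Bellman residual among the partition's states."""
    return max(abs(backup(s, V, transitions, gamma) - V[s]) for s in part)


def partitioned_prioritized_vi(transitions: Transitions, partitions: List[List[int]],
                               gamma: float = 0.9, eps: float = 1e-8) -> Dict[int, float]:
    V = {s: 0.0 for s in transitions}
    while True:
        # Recompute every partition's priority (simple, not efficient).
        priorities = [residual(part, V, transitions, gamma) for part in partitions]
        best = max(range(len(partitions)), key=priorities.__getitem__)
        if priorities[best] < eps:
            return V
        # Sweep only the highest-priority partition until it converges locally,
        # holding the values of all other partitions fixed.
        while True:
            delta = 0.0
            for s in partitions[best]:
                new_v = backup(s, V, transitions, gamma)
                delta = max(delta, abs(new_v - V[s]))
                V[s] = new_v
            if delta < eps:
                break


# Usage: a 4-state chain where moving right from state 2 into terminal state 3 pays 1.
chain: Transitions = {
    0: {0: [(1, 1.0, 0.0)]},
    1: {0: [(2, 1.0, 0.0)]},
    2: {0: [(3, 1.0, 1.0)]},
    3: {},
}
values = partitioned_prioritized_vi(chain, partitions=[[0, 1], [2, 3]], gamma=0.9)
```

In this sketch the partition containing the rewarding transition is solved first, and only then does the upstream partition acquire a nonzero priority, so no sweeps are spent on states whose values cannot yet change. A more careful version would update priorities only for partitions that depend on the one just solved rather than recomputing all of them.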

Cite

Text

Wingate and Seppi. "P3VI: A Partitioned, Prioritized, Parallel Value Iterator." International Conference on Machine Learning, 2004. doi:10.1145/1015330.1015440

Markdown

[Wingate and Seppi. "P3VI: A Partitioned, Prioritized, Parallel Value Iterator." International Conference on Machine Learning, 2004.](https://mlanthology.org/icml/2004/wingate2004icml-p/) doi:10.1145/1015330.1015440

BibTeX

@inproceedings{wingate2004icml-p,
  title     = {{P3VI: A Partitioned, Prioritized, Parallel Value Iterator}},
  author    = {Wingate, David and Seppi, Kevin D.},
  booktitle = {International Conference on Machine Learning},
  year      = {2004},
  doi       = {10.1145/1015330.1015440},
  url       = {https://mlanthology.org/icml/2004/wingate2004icml-p/}
}