Hypervolume Indicator and Dominance Reward Based Multi-Objective Monte-Carlo Tree Search
Abstract
Concerned with multi-objective reinforcement learning (MORL), this paper presents MOMCTS, an extension of Monte-Carlo Tree Search to multi-objective sequential decision making, embedding two decision rules based respectively on the hypervolume indicator and the Pareto dominance reward. The MOMCTS approaches are first compared with the MORL state of the art on two artificial problems: the two-objective Deep Sea Treasure problem and the three-objective Resource Gathering problem. The scalability of MOMCTS is then examined on the NP-hard grid scheduling problem, where its performance matches the (non-RL-based) state of the art, albeit at a higher computational cost.
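The two decision rules named in the abstract rest on the Pareto dominance relation and the hypervolume indicator. As a hedged illustration only (not the paper's implementation), the sketch below shows a dominance test and an exact two-objective hypervolume computation under a maximization convention; the function names and the reference-point handling are assumptions for this example.

```python
def dominates(a, b):
    """True if point a Pareto-dominates point b (all objectives maximized):
    a is at least as good everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def hypervolume_2d(points, ref):
    """Area dominated by `points` with respect to reference point `ref`
    (both objectives maximized; `ref` lies below/left of every point)."""
    # Keep only the non-dominated front, then sweep it left to right.
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    front.sort()  # ascending in objective 1, so objective 2 descends
    hv, prev_x = 0.0, ref[0]
    for x, y in front:
        hv += (x - prev_x) * (y - ref[1])  # vertical slab under this point
        prev_x = x
    return hv

# Example: a three-point front; a dominated point (1, 1) does not change the value.
print(hypervolume_2d([(1, 3), (2, 2), (3, 1), (1, 1)], ref=(0, 0)))  # → 6.0
```

The dominance reward of the paper compares a new solution against the archive via `dominates`; the hypervolume-based rule scores it by its marginal contribution to a quantity like `hypervolume_2d`. Exact hypervolume computation grows expensive with the number of objectives, which is consistent with the higher computational cost reported in the abstract.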
Cite
Text
Wang and Sebag. "Hypervolume Indicator and Dominance Reward Based Multi-Objective Monte-Carlo Tree Search." Machine Learning, 2013. doi:10.1007/s10994-013-5369-0
Markdown
[Wang and Sebag. "Hypervolume Indicator and Dominance Reward Based Multi-Objective Monte-Carlo Tree Search." Machine Learning, 2013.](https://mlanthology.org/mlj/2013/wang2013mlj-hypervolume/) doi:10.1007/s10994-013-5369-0
BibTeX
@article{wang2013mlj-hypervolume,
title = {{Hypervolume Indicator and Dominance Reward Based Multi-Objective Monte-Carlo Tree Search}},
author = {Wang, Weijia and Sebag, Michèle},
journal = {Machine Learning},
year = {2013},
pages = {403--429},
doi = {10.1007/s10994-013-5369-0},
volume = {92},
url = {https://mlanthology.org/mlj/2013/wang2013mlj-hypervolume/}
}