Heuristic Selection of Actions in Multiagent Reinforcement Learning

Abstract

This work presents a new algorithm, Heuristically Accelerated Minimax-Q (HAMMQ), that uses heuristics to speed up the well-known multiagent reinforcement learning algorithm Minimax-Q. HAMMQ is characterised by a heuristic function that influences the choice of actions. This function is associated with a preference policy that indicates that a certain action must be taken instead of another. A set of empirical evaluations was conducted for the proposed algorithm in a simplified simulator for the robot soccer domain, and the experimental results show that even very simple heuristics significantly enhance the performance of the multiagent reinforcement learning algorithm.
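
The abstract does not give the action-selection rule, so the following is only a minimal sketch of how a heuristic function might bias action choice in a Minimax-Q style learner. It assumes the heuristic is added to the action values before a worst-case (maximin) comparison, and that exploration is epsilon-greedy. The names `select_action`, `q`, `h`, `xi`, and `epsilon` are hypothetical, and the sketch restricts itself to pure strategies, whereas Minimax-Q in general solves for a mixed strategy.

```python
import random


def select_action(q, h, state, actions, opp_actions,
                  xi=1.0, epsilon=0.1, rng=random):
    """Pick an action using heuristically biased, worst-case action values.

    q, h        : dicts mapping (state, action, opponent_action) to floats;
                  missing entries are treated as 0.0 (an assumption).
    xi          : hypothetical weight on the heuristic term.
    epsilon     : probability of choosing a uniformly random action.
    """
    # Explore: with probability epsilon, ignore values and heuristic.
    if rng.random() < epsilon:
        return rng.choice(actions)

    def worst_case(action):
        # Value of `action` against the opponent reply that hurts most,
        # with the heuristic added to the learned value.
        return min(
            q.get((state, action, o), 0.0) + xi * h.get((state, action, o), 0.0)
            for o in opp_actions
        )

    # Exploit: take the action with the best worst-case biased value.
    return max(actions, key=worst_case)


if __name__ == "__main__":
    actions = ["left", "right"]
    opp_actions = ["block", "wait"]
    # Untrained values (empty q); the heuristic prefers "right" in state "s".
    h = {("s", "right", o): 1.0 for o in opp_actions}
    print(select_action({}, h, "s", actions, opp_actions, epsilon=0.0))
```

In this sketch the heuristic only changes which action is chosen; the learned values in `q` are left untouched, so a poor heuristic can be outweighed once the learned values grow large enough relative to `xi`.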

Cite

Text

Bianchi et al. "Heuristic Selection of Actions in Multiagent Reinforcement Learning." International Joint Conference on Artificial Intelligence, 2007.

Markdown

[Bianchi et al. "Heuristic Selection of Actions in Multiagent Reinforcement Learning." International Joint Conference on Artificial Intelligence, 2007.](https://mlanthology.org/ijcai/2007/bianchi2007ijcai-heuristic/)

BibTeX

@inproceedings{bianchi2007ijcai-heuristic,
  title     = {{Heuristic Selection of Actions in Multiagent Reinforcement Learning}},
  author    = {Bianchi, Reinaldo A. C. and Ribeiro, Carlos H. C. and Costa, Anna Helena Reali},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2007},
  pages     = {690-695},
  url       = {https://mlanthology.org/ijcai/2007/bianchi2007ijcai-heuristic/}
}