An Adaptive Architecture for Modular Q-Learning

Abstract

Reinforcement learning is a technique for learning action policies that maximize utility, guided by reinforcement signals: reward or punishment. Q-learning, a widely used reinforcement learning method, has been studied extensively in research on autonomous agents. However, as the problem space grows, agents need more computational resources and more time to learn appropriate policies. Whitehead proposed an architecture called modular Q-learning that decomposes the whole problem space into smaller subproblem spaces and distributes them among multiple modules, so that each module takes charge of part of the whole problem. In modular Q-learning, however, human designers must decompose the problem space and create a suitable set of modules manually, and agents with such a fixed module architecture cannot adapt to dynamic environments. Here, we propose a new reinforcement learning architecture called AMQL (Automatic Modular Q-Learning) that enables agents to obtain a suitable set of modules by themselves using a selection method. Through experiments, we show that agents can automatically obtain modules suitable for gaining a reward. Furthermore, we show that agents can adapt efficiently to dynamic environments by reconstructing modules.
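The modular decomposition described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how module outputs are combined, so the sketch assumes a simple rule that sums each module's Q-values per action and picks the largest total. The names `Module`, `view`, and `select_action`, and the parameters `alpha`, `gamma`, and `epsilon`, are hypothetical. The sketch covers only a fixed, hand-written set of modules; the automatic module selection of AMQL is not shown.

```python
import random
from collections import defaultdict


class Module:
    """One tabular Q-learning module that sees only part of the full state."""

    def __init__(self, view, n_actions, alpha=0.1, gamma=0.9):
        self.view = view  # hypothetical: maps the full state to this module's sub-state
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def values(self, state):
        """Return this module's Q-values for its view of the state."""
        return self.q[self.view(state)]

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update on the module's sub-state."""
        best_next = max(self.q[self.view(next_state)])
        row = self.q[self.view(state)]
        row[action] += self.alpha * (reward + self.gamma * best_next - row[action])


def select_action(modules, state, n_actions, epsilon=0.0, rng=random):
    """Assumed merging rule: sum Q-values over modules, then act greedily.

    With probability epsilon a uniformly random action is taken instead.
    """
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    totals = [sum(m.values(state)[a] for m in modules) for a in range(n_actions)]
    return max(range(n_actions), key=totals.__getitem__)
```

For a state given as a pair `(x, y)`, two modules could be built with `Module(lambda s: s[0], n_actions)` and `Module(lambda s: s[1], n_actions)`, so that each table grows with the size of one coordinate rather than with the size of the joint space.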

Cite

Text

Kohri et al. "An Adaptive Architecture for Modular Q-Learning." International Joint Conference on Artificial Intelligence, 1997.

Markdown

[Kohri et al. "An Adaptive Architecture for Modular Q-Learning." International Joint Conference on Artificial Intelligence, 1997.](https://mlanthology.org/ijcai/1997/kohri1997ijcai-adaptive/)

BibTeX

@inproceedings{kohri1997ijcai-adaptive,
  title     = {{An Adaptive Architecture for Modular Q-Learning}},
  author    = {Kohri, Takayuki and Matsubayashi, Kei and Tokoro, Mario},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {1997},
  pages     = {820-825},
  url       = {https://mlanthology.org/ijcai/1997/kohri1997ijcai-adaptive/}
}