Adversarial Goal Generation for Intrinsic Motivation

Abstract

In reinforcement learning, the goal, or reward signal, is generally given by the environment and cannot be controlled by the agent. We propose to introduce an intrinsic motivation module that selects a reward function for the agent to learn to achieve. To generalize across these goals, we use a Universal Value Function Approximator, which takes as input both the state and the parameters of this reward function (the goal) and predicts the value function (or action-value function). This module is trained to generate goals such that the agent's learning is maximized; thus, it is also a method for automatic curriculum learning.
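To make the two components concrete, below is a minimal PyTorch sketch of the setup the abstract describes: a Universal Value Function Approximator conditioned on goal parameters, and a goal generator trained adversarially against it. This is an illustrative sketch, not the authors' implementation: the network shapes, the distance-based goal reward, and the use of squared TD error as a proxy for "the agent's learning is maximized" are all assumptions, and the paper's actual objective may differ.

import torch
import torch.nn as nn

STATE_DIM, GOAL_DIM, NOISE_DIM = 4, 2, 8  # illustrative sizes

class UVFA(nn.Module):
    """Predicts V(s, g): the value of state s under the reward function
    parameterized by goal g, so one network generalizes across goals."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + GOAL_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1))

    def forward(self, state, goal):
        # Concatenating state and goal lets the value function condition
        # on both, as in the UVFA formulation.
        return self.net(torch.cat([state, goal], dim=-1))

class GoalGenerator(nn.Module):
    """Intrinsic motivation module: maps noise to goal parameters."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM, 64), nn.ReLU(),
            nn.Linear(64, GOAL_DIM), nn.Tanh())  # assume a bounded goal space

    def forward(self, z):
        return self.net(z)

uvfa, gen = UVFA(), GoalGenerator()
v_opt = torch.optim.Adam(uvfa.parameters(), lr=1e-3)
g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)

def goal_reward(next_state, goal):
    # Assumed goal-parameterized reward: negative distance between the
    # first GOAL_DIM state coordinates and the goal.
    return -torch.linalg.norm(next_state[:, :GOAL_DIM] - goal,
                              dim=-1, keepdim=True)

def td_error(state, next_state, reward, goal, gamma=0.99):
    # One-step TD error of the UVFA on this goal, used here as a
    # stand-in for how much the agent can still learn about the goal.
    target = reward + gamma * uvfa(next_state, goal).detach()
    return uvfa(state, goal) - target

# Dummy transitions standing in for real environment experience.
s, s_next = torch.randn(32, STATE_DIM), torch.randn(32, STATE_DIM)

# Adversarial generator step: propose goals with large TD error,
# i.e., goals the agent has not yet mastered.
g = gen(torch.randn(32, NOISE_DIM))
gen_loss = -td_error(s, s_next, goal_reward(s_next, g), g).pow(2).mean()
g_opt.zero_grad(); gen_loss.backward(); g_opt.step()

# UVFA step: reduce TD error on freshly generated (detached) goals.
g = gen(torch.randn(32, NOISE_DIM)).detach()
v_loss = td_error(s, s_next, goal_reward(s_next, g), g).pow(2).mean()
v_opt.zero_grad(); v_loss.backward(); v_opt.step()

Read this way, the structure mirrors a GAN: the generator is rewarded where the value learner's predictions are still poor, which pushes proposed goals toward the frontier of what the agent can currently achieve and yields the automatic curriculum the abstract refers to.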

Cite

Text

Durugkar and Stone. "Adversarial Goal Generation for Intrinsic Motivation." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.12195

Markdown

[Durugkar and Stone. "Adversarial Goal Generation for Intrinsic Motivation." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/durugkar2018aaai-adversarial/) doi:10.1609/AAAI.V32I1.12195

BibTeX

@inproceedings{durugkar2018aaai-adversarial,
  title     = {{Adversarial Goal Generation for Intrinsic Motivation}},
  author    = {Durugkar, Ishan and Stone, Peter},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {8073--8074},
  doi       = {10.1609/AAAI.V32I1.12195},
  url       = {https://mlanthology.org/aaai/2018/durugkar2018aaai-adversarial/}
}