Learning to Achieve Goals
Abstract
Temporal difference methods solve the temporal credit assignment problem for reinforcement learning. An important subproblem of general reinforcement learning is learning to achieve dynamic goals. Although existing temporal difference methods, such as Q learning, can be applied to this problem, they do not take advantage of its special structure. This paper presents the DG-learning algorithm, which learns efficiently to achieve dynamically changing goals and exhibits good knowledge transfer between goals. In addition, this paper shows how traditional relaxation techniques can be applied to the problem. Finally, experimental results are given that demonstrate the superiority of DG learning over Q learning in a moderately large, synthetic, non-deterministic domain.
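The abstract's key idea, learning action values conditioned on a goal so that every transition informs the estimates for all goals at once, can be illustrated with a small sketch. This is a hypothetical, simplified tabular illustration of goal-conditioned value learning in the spirit of DG-learning, not the paper's exact algorithm; the class name, reward convention (1 on reaching the goal, 0 otherwise), and parameters are assumptions for the example.

```python
from collections import defaultdict


class DGLearner:
    """Illustrative goal-conditioned tabular learner (not the paper's exact method).

    Maintains Q[(state, action, goal)] and, after each observed transition,
    updates the estimate toward *every* goal, so experience gathered while
    pursuing one goal transfers to all the others.
    """

    def __init__(self, actions, goals, alpha=0.5, gamma=0.9):
        self.q = defaultdict(float)  # Q[(s, a, g)]: value of doing a in s to reach g
        self.actions = actions
        self.goals = goals
        self.alpha = alpha           # learning rate (assumed value)
        self.gamma = gamma           # discount factor (assumed value)

    def update(self, s, a, s2):
        """One all-goals temporal-difference update for the transition (s, a, s2)."""
        for g in self.goals:
            reached = (s2 == g)
            reward = 1.0 if reached else 0.0
            # No bootstrapping past a reached goal; otherwise use the best next action.
            best_next = 0.0 if reached else max(self.q[(s2, b, g)] for b in self.actions)
            target = reward + self.gamma * best_next
            self.q[(s, a, g)] += self.alpha * (target - self.q[(s, a, g)])

    def act(self, s, g):
        """Greedy action for the current goal g."""
        return max(self.actions, key=lambda a: self.q[(s, a, g)])
```

Because the table is indexed by goal, switching to a new goal requires no relearning: the agent simply reads out values for the new goal index, which is the knowledge-transfer property the abstract highlights.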
Cite

Kaelbling. "Learning to Achieve Goals." International Joint Conference on Artificial Intelligence, 1993.

BibTeX:
@inproceedings{kaelbling1993ijcai-learning,
title = {{Learning to Achieve Goals}},
author = {Kaelbling, Leslie Pack},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {1993},
pages = {1094-1099},
url = {https://mlanthology.org/ijcai/1993/kaelbling1993ijcai-learning/}
}