Planning with an Adaptive World Model

Abstract

We present a new connectionist planning method [TML90]. By interaction with an unknown environment, a world model is progressively constructed using gradient descent. For deriving optimal actions with respect to future reinforcement, planning is applied in two steps: an experience network proposes a plan which is subsequently optimized by gradient descent with a chain of world models, so that an optimal reinforcement may be obtained when it is actually run. The appropriateness of this method is demonstrated by a robotics application and a pole balancing task.
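The second planning step, refining a proposed plan by gradient descent through a chain of world models, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a fixed scalar linear model `s' = a*s + b*u` in place of the learned network, a quadratic cost on the final state in place of the reinforcement signal, and an all-zero initial plan in place of the experience network's proposal. The names `rollout` and `optimize_plan` are hypothetical.

```python
def rollout(model, s0, plan):
    """Chain copies of the world model: each predicted state feeds the next step."""
    a, b = model  # assumed linear model: s_next = a * s + b * u
    states = [s0]
    for u in plan:
        states.append(a * states[-1] + b * u)
    return states


def optimize_plan(model, s0, plan, goal, lr=0.1, steps=200):
    """Refine an action sequence by gradient descent through the model chain."""
    a, b = model
    plan = list(plan)
    for _ in range(steps):
        final = rollout(model, s0, plan)[-1]
        # Assumed cost: (final - goal)^2, standing in for negative reinforcement.
        grad_state = 2.0 * (final - goal)
        # Propagate the gradient backwards through the chain, last step first.
        for t in reversed(range(len(plan))):
            plan[t] -= lr * grad_state * b  # d(next state)/d(action) = b
            grad_state *= a                 # d(next state)/d(state) = a
    return plan


model = (0.9, 0.5)          # hypothetical model parameters
initial_plan = [0.0] * 5    # stand-in for the experience network's proposal
plan = optimize_plan(model, s0=0.0, plan=initial_plan, goal=1.0)
final_state = rollout(model, 0.0, plan)[-1]
```

With these assumed values the optimized plan drives the predicted final state toward the goal. In the method described above, the model itself would be a network trained on interaction with the environment, so the two derivatives `a` and `b` would be replaced by gradients backpropagated through that network.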

Cite

Text

Thrun et al. "Planning with an Adaptive World Model." Neural Information Processing Systems, 1990.

Markdown

[Thrun et al. "Planning with an Adaptive World Model." Neural Information Processing Systems, 1990.](https://mlanthology.org/neurips/1990/thrun1990neurips-planning/)

BibTeX

@inproceedings{thrun1990neurips-planning,
  title     = {{Planning with an Adaptive World Model}},
  author    = {Thrun, Sebastian and Möller, Knut and Linden, Alexander},
  booktitle = {Neural Information Processing Systems},
  year      = {1990},
  pages     = {450-456},
  url       = {https://mlanthology.org/neurips/1990/thrun1990neurips-planning/}
}