Model Regularization for Stable Sample Rollouts

Abstract

When an imperfect model is used to generate sample rollouts, its errors tend to compound – a flawed sample is given as input to the model, which causes more errors, and so on. This presents a barrier to applying rollout-based planning algorithms to learned models. To address this issue, a training methodology called "hallucinated replay" is introduced, which adds samples from the model into the training data, thereby training the model to produce sensible predictions when its own samples are given as input. Capabilities and limitations of this approach are studied empirically. In several examples hallucinated replay allows effective planning with imperfect models while models trained using only real experience fail dramatically.
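The core idea can be illustrated with a minimal sketch. Everything below is hypothetical and not from the paper: a toy 1-D deterministic dynamics task, a linear model fit by SGD, and a mixing probability `p` that decides whether a training pair uses the real state or the model's own one-step prediction as input. In both cases the regression target is the *real* next observation, so the model is explicitly trained to recover from its own imperfect samples.

```python
import random

def true_step(s):
    """Toy ground-truth dynamics (illustrative): s' = 0.9*s + 1."""
    return 0.9 * s + 1.0

def train(p=0.5, lr=1e-3, steps=20000, seed=0):
    """Fit a linear model s' ~ w*s + b, mixing real and 'hallucinated' inputs.

    With probability p the input is the model's own prediction of the next
    state (as it would appear inside a sample rollout), while the target is
    still the real observation two steps ahead.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        s = rng.uniform(-10.0, 10.0)
        s1 = true_step(s)            # real next state
        s2 = true_step(s1)           # real state two steps ahead
        if rng.random() < p:
            x, y = w * s + b, s2     # hallucinated input, real target
        else:
            x, y = s, s1             # ordinary real transition
        err = (w * x + b) - y        # squared-error gradient step
        w -= lr * err * x
        b -= lr * err
    return w, b
```

At the optimum (w = 0.9, b = 1) the two kinds of training pairs agree, since a perfect model's sample coincides with the real next state; away from it, the hallucinated pairs pull the model toward making sensible predictions on its own outputs.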

Cite

Text

Talvitie. "Model Regularization for Stable Sample Rollouts." Conference on Uncertainty in Artificial Intelligence, 2014.

Markdown

[Talvitie. "Model Regularization for Stable Sample Rollouts." Conference on Uncertainty in Artificial Intelligence, 2014.](https://mlanthology.org/uai/2014/talvitie2014uai-model/)

BibTeX

@inproceedings{talvitie2014uai-model,
  title     = {{Model Regularization for Stable Sample Rollouts}},
  author    = {Talvitie, Erik},
  booktitle = {Conference on Uncertainty in Artificial Intelligence},
  year      = {2014},
  pages     = {780--789},
  url       = {https://mlanthology.org/uai/2014/talvitie2014uai-model/}
}