How to Miss Data? Reinforcement Learning for Environments with High Observation Cost

Abstract

We consider a reinforcement learning (RL) setting in which there is a cost associated with making accurate observations. We propose a reward shaping framework and present a self-tuning RL agent that learns to adjust the accuracy of its samples. We consider two scenarios: in the first, the agent directly varies the accuracy level of each sample; in the second, the agent decides to observe some samples perfectly and to miss the others. In contrast to existing work that focuses on sample efficiency during training, our focus is on the behavior of the agent when the observation cost is an intrinsic part of the environment. Our results illustrate that the RL agent successfully learns that not all samples are equally informative and chooses to observe, with high accuracy, the samples that are most critical for the task at hand.
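The abstract does not include an implementation, but the reward shaping idea it describes can be illustrated with a minimal sketch: the agent's choice of observation accuracy corrupts the returned observation and is charged against the task reward. Everything below is an assumption for illustration only; the class name CostlyObservationWrapper, the environment interface, the cost weight lam, and the specific noise and cost forms are hypothetical and not the authors' code or parameters.

import numpy as np

class CostlyObservationWrapper:
    """Hypothetical sketch of reward shaping with an observation cost.

    Assumes `base_env` exposes reset() and step(action) returning
    (obs, reward, done). The agent supplies an `accuracy` in [0, 1]
    alongside its control action: higher accuracy means a cleaner
    observation but a larger cost subtracted from the task reward.
    """

    def __init__(self, base_env, lam=0.1, max_noise_std=1.0):
        self.env = base_env
        self.lam = lam                    # weight of the observation cost in the shaped reward
        self.max_noise_std = max_noise_std  # noise level when accuracy = 0

    def reset(self):
        return self.env.reset()

    def step(self, action, accuracy):
        obs, task_reward, done = self.env.step(action)

        # Scenario 1 (continuous accuracy): lower accuracy -> noisier observation.
        noise_std = (1.0 - accuracy) * self.max_noise_std
        noisy_obs = obs + np.random.normal(0.0, noise_std, size=np.shape(obs))

        # Reward shaping: the agent pays in proportion to the requested accuracy.
        shaped_reward = task_reward - self.lam * accuracy
        return noisy_obs, shaped_reward, done

In the second scenario described in the abstract, accuracy would instead be a binary observe/miss decision: the observation is returned exactly when the agent chooses to observe (and the cost is paid), and is withheld otherwise at no cost. Again, this is only a sketch of the stated setting, not the paper's implementation.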

Cite

Text

Koseoglu and Ozcelikkale. "How to Miss Data? Reinforcement Learning for Environments with High Observation Cost." ICML 2020 Workshops: Artemiss, 2020.

Markdown

[Koseoglu and Ozcelikkale. "How to Miss Data? Reinforcement Learning for Environments with High Observation Cost." ICML 2020 Workshops: Artemiss, 2020.](https://mlanthology.org/icmlw/2020/koseoglu2020icmlw-miss/)

BibTeX

@inproceedings{koseoglu2020icmlw-miss,
  title     = {{How to Miss Data? Reinforcement Learning for Environments with High Observation Cost}},
  author    = {Koseoglu, Mehmet and Ozcelikkale, Ayca},
  booktitle = {ICML 2020 Workshops: Artemiss},
  year      = {2020},
  url       = {https://mlanthology.org/icmlw/2020/koseoglu2020icmlw-miss/}
}