Exploiting History Data for Nonstationary Multi-Armed Bandit

Abstract

The Multi-armed Bandit (MAB) framework has been applied successfully in many application fields. In recent years, active approaches to the nonstationary MAB setting, i.e., algorithms that detect changes in the environment and automatically reconfigure in response, have widened the areas of application of MAB techniques. However, such approaches have the drawback of not reusing information in settings where the same environment conditions recur over time. This paper presents a framework to integrate past information in the abruptly changing nonstationary setting, which allows active MAB approaches to recover from changes quickly. The proposed framework relies on well-known break-point prediction methods to correctly identify the instant at which the environment changed in the past, and on a definition of recurring concepts specific to the MAB setting to reuse information from recurring MAB states when necessary. We show that this framework does not change the order of the regret suffered by the active approaches commonly used in the bandit field. Finally, we provide an extensive experimental analysis on both synthetic and real-world data, showing the improvement provided by our framework.
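To make the idea of an active approach that reuses history more concrete, the following is a minimal sketch and not the authors' framework. It assumes UCB1 as the base policy, swaps in a naive windowed mean-shift test for the break-point prediction methods mentioned in the abstract, and treats a "concept" as a plain vector of per-arm mean estimates. The class name `HistoryAwareUCB` and the parameters `window`, `threshold`, `tol`, and `min_pulls` are hypothetical and chosen only for illustration.

```python
import math
from collections import deque


class HistoryAwareUCB:
    """UCB1 with a naive mean-shift detector and a memory of past concepts.

    Illustrative assumption only: this is not the method from the paper.
    """

    def __init__(self, n_arms, window=20, threshold=0.4, tol=0.2, min_pulls=5):
        self.n_arms = n_arms
        self.window = window        # recent rewards kept per arm
        self.threshold = threshold  # mean shift treated as a change
        self.tol = tol              # max per-arm gap to match a stored concept
        self.min_pulls = min_pulls  # fresh pulls per arm required before matching
        self.memory = []            # stored concepts: one list of arm means each
        self._reset()

    def _reset(self):
        self.counts = [0] * self.n_arms
        self.sums = [0.0] * self.n_arms
        self.recent = [deque(maxlen=self.window) for _ in range(self.n_arms)]
        self.reused = False

    def means(self):
        return [s / c if c else 0.0 for s, c in zip(self.sums, self.counts)]

    def select_arm(self):
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm  # play every arm once before using the index
        total = sum(self.counts)
        ucb = [m + math.sqrt(2 * math.log(total) / c)
               for m, c in zip(self.means(), self.counts)]
        return max(range(self.n_arms), key=ucb.__getitem__)

    def _older_mean(self, arm):
        """Mean of the observations that are outside the recent window."""
        older = self.counts[arm] - len(self.recent[arm])
        if older <= 0:
            return None
        return (self.sums[arm] - sum(self.recent[arm])) / older

    def _changed(self, arm):
        older = self.counts[arm] - len(self.recent[arm])
        if older < self.window or len(self.recent[arm]) < self.window:
            return False  # too little data on one side of the comparison
        recent_mean = sum(self.recent[arm]) / len(self.recent[arm])
        return abs(recent_mean - self._older_mean(arm)) > self.threshold

    def _match(self, means):
        for concept in self.memory:
            if all(abs(a - b) <= self.tol for a, b in zip(means, concept)):
                return concept
        return None

    def update(self, arm, reward):
        """Record a reward; return True if a change was detected."""
        self.counts[arm] += 1
        self.sums[arm] += reward
        self.recent[arm].append(reward)
        if self._changed(arm):
            # Store the pre-change estimates (unless already known), then restart.
            snapshot = []
            for a, overall in enumerate(self.means()):
                older = self._older_mean(a)
                snapshot.append(overall if older is None else older)
            if self._match(snapshot) is None:
                self.memory.append(snapshot)
            self._reset()
            return True
        if not self.reused and min(self.counts) >= self.min_pulls:
            concept = self._match(self.means())
            if concept is not None:
                # Recurring concept: seed the statistics with pseudo-observations.
                for a, mean in enumerate(concept):
                    self.counts[a] += self.window
                    self.sums[a] += mean * self.window
                self.reused = True
        return False
```

In a simulation loop, one would call `select_arm()`, observe a reward, and pass it to `update()`. Note that plain UCB1 rarely samples arms it believes to be suboptimal, so a practical active method would typically add some forced exploration so that changes in those arms can also be detected; this sketch omits that for brevity.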

Cite

Text

Re et al. "Exploiting History Data for Nonstationary Multi-Armed Bandit." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021. doi:10.1007/978-3-030-86486-6_4

Markdown

[Re et al. "Exploiting History Data for Nonstationary Multi-Armed Bandit." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021.](https://mlanthology.org/ecmlpkdd/2021/re2021ecmlpkdd-exploiting/) doi:10.1007/978-3-030-86486-6_4

BibTeX

@inproceedings{re2021ecmlpkdd-exploiting,
  title     = {{Exploiting History Data for Nonstationary Multi-Armed Bandit}},
  author    = {Re, Gerlando and Chiusano, Fabio and Trovò, Francesco and Carrera, Diego and Boracchi, Giacomo and Restelli, Marcello},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2021},
  pages     = {51--66},
  doi       = {10.1007/978-3-030-86486-6_4},
  url       = {https://mlanthology.org/ecmlpkdd/2021/re2021ecmlpkdd-exploiting/}
}