Identifying Outlier Arms in Multi-Armed Bandit

Abstract

We study a novel problem lying at the intersection of two areas: multi-armed bandit and outlier detection. Multi-armed bandit is a useful tool to model the process of incrementally collecting data for multiple objects in a decision space. Outlier detection is a powerful method to narrow down the attention to a few objects after the data for them are collected. However, no one has studied how to detect outlier objects while incrementally collecting data for them, which is necessary when data collection is expensive. We formalize this problem as identifying outlier arms in a multi-armed bandit. We propose two sampling strategies with theoretical guarantee, and analyze their sampling efficiency. Our experimental results on both synthetic and real data show that our solution saves 70-99% of data collection cost from baseline while having nearly perfect accuracy.

Cite

Text

Zhuang et al. "Identifying Outlier Arms in Multi-Armed Bandit." Neural Information Processing Systems, 2017.

Markdown

[Zhuang et al. "Identifying Outlier Arms in Multi-Armed Bandit." Neural Information Processing Systems, 2017.](https://mlanthology.org/neurips/2017/zhuang2017neurips-identifying/)

BibTeX

@inproceedings{zhuang2017neurips-identifying,
  title     = {{Identifying Outlier Arms in Multi-Armed Bandit}},
  author    = {Zhuang, Honglei and Wang, Chi and Wang, Yifan},
  booktitle = {Neural Information Processing Systems},
  year      = {2017},
  pages     = {5204-5213},
  url       = {https://mlanthology.org/neurips/2017/zhuang2017neurips-identifying/}
}