Pure Exploration of Multi-Armed Bandits with Heavy-Tailed Payoffs
Abstract
Inspired by heavy-tailed distributions in practical scenarios, we investigate the problem of pure exploration in Multi-Armed Bandits (MAB) with heavy-tailed payoffs, relaxing the standard assumption of sub-Gaussian payoff noise and instead assuming that the stochastic payoffs have finite $p$-th moments, where $p\in (1,+\infty)$. The main contributions of this paper are three-fold. First, using martingale techniques, we analyze the tail probabilities of the empirical average and the truncated empirical average (TEA) for estimating expected payoffs in sequential decisions with heavy-tailed noise. Second, we propose two effective bandit algorithms, based on different prior information (i.e., fixed confidence or fixed budget), for pure exploration of MAB whose payoffs have finite $p$-th moments. Third, we derive theoretical guarantees for the two proposed algorithms and demonstrate their effectiveness for pure exploration of MAB with heavy-tailed payoffs on synthetic data and real-world financial data.
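As a rough illustration of the truncated empirical average (TEA) idea mentioned in the abstract, the sketch below follows the standard truncated-mean construction for heavy-tailed samples: each observation is kept only if its magnitude is below a threshold that grows with the sample index. The threshold schedule $(u\,i/\log(1/\delta))^{1/p}$, and the parameter names `u` (moment bound) and `delta` (confidence level), are assumptions for this sketch, not the paper's exact construction.

```python
import numpy as np

def truncated_empirical_average(samples, p, u, delta):
    """Sketch of a truncated empirical average (TEA) for heavy-tailed samples.

    Assumptions (illustrative, not the paper's exact estimator):
      - samples have a finite p-th raw moment bounded by u,
      - the truncation threshold for the i-th sample grows as
        (u * i / log(1/delta)) ** (1/p),
    so extreme early observations are zeroed out rather than averaged in.
    """
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    idx = np.arange(1, n + 1)
    # Per-sample truncation thresholds: larger index -> looser truncation.
    thresholds = (u * idx / np.log(1.0 / delta)) ** (1.0 / p)
    # Keep a sample only if it lies within its threshold; otherwise use 0.
    truncated = np.where(np.abs(samples) <= thresholds, samples, 0.0)
    return truncated.mean()
```

Compared with the plain empirical average, the truncation sacrifices a small bias in exchange for sub-Gaussian-like concentration of the estimate, which is what makes confidence-interval-based exploration feasible under only a finite $p$-th moment assumption.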
Cite
Text
Yu et al. "Pure Exploration of Multi-Armed Bandits with Heavy-Tailed Payoffs." Conference on Uncertainty in Artificial Intelligence, 2018.

Markdown
[Yu et al. "Pure Exploration of Multi-Armed Bandits with Heavy-Tailed Payoffs." Conference on Uncertainty in Artificial Intelligence, 2018.](https://mlanthology.org/uai/2018/yu2018uai-pure/)

BibTeX
@inproceedings{yu2018uai-pure,
title = {{Pure Exploration of Multi-Armed Bandits with Heavy-Tailed Payoffs}},
author = {Yu, Xiaotian and Shao, Han and Lyu, Michael R. and King, Irwin},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
year = {2018},
pages = {937--946},
url = {https://mlanthology.org/uai/2018/yu2018uai-pure/}
}