Bias-Robust Bayesian Optimization via Dueling Bandits

Abstract

We consider Bayesian optimization in settings where observations can be adversarially biased, for example by an uncontrolled hidden confounder. Our first contribution is a reduction of the confounded setting to the dueling bandit model. Building on this reduction, we propose a novel approach for dueling bandits based on information-directed sampling (IDS). This yields the first efficient kernelized algorithm for dueling bandits that comes with cumulative regret guarantees. Our analysis further generalizes a previously proposed semi-parametric linear bandit model to non-linear reward functions, and uncovers interesting links to doubly-robust estimation.
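The dueling bandit model referenced above can be illustrated with a toy simulation. This is a hedged sketch only, not the paper's IDS algorithm: the quadratic reward function and the logistic comparison link below are assumptions chosen for illustration (a logistic link is a common choice in the dueling-bandit literature).

```python
import math
import random

def duel(f, x, y, rng):
    """Return 1 if arm x wins a noisy comparison against arm y, else 0.

    The win probability follows a logistic link on the reward gap f(x) - f(y);
    the learner never observes f directly, only the binary outcome.
    """
    p_win = 1.0 / (1.0 + math.exp(-(f(x) - f(y))))
    return 1 if rng.random() < p_win else 0

def empirical_win_rate(f, x, y, n, seed=0):
    """Estimate P(x beats y) from n simulated duels."""
    rng = random.Random(seed)
    return sum(duel(f, x, y, rng) for _ in range(n)) / n

if __name__ == "__main__":
    # Hypothetical hidden reward, maximized at 0.3 (an assumption for this demo).
    f = lambda a: -(a - 0.3) ** 2
    # The arm closer to the optimum should win more than half the duels.
    print(empirical_win_rate(f, 0.3, 0.9, n=2000))
```

The key point of the comparison-based feedback is that any bias applied equally to both arms cancels in the reward difference, which is what makes the reduction from the confounded setting natural.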

Cite

Text

Kirschner and Krause. "Bias-Robust Bayesian Optimization via Dueling Bandits." International Conference on Machine Learning, 2021.

Markdown

[Kirschner and Krause. "Bias-Robust Bayesian Optimization via Dueling Bandits." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/kirschner2021icml-biasrobust/)

BibTeX

@inproceedings{kirschner2021icml-biasrobust,
  title     = {{Bias-Robust Bayesian Optimization via Dueling Bandits}},
  author    = {Kirschner, Johannes and Krause, Andreas},
  booktitle = {International Conference on Machine Learning},
  year      = {2021},
  pages     = {5595--5605},
  volume    = {139},
  url       = {https://mlanthology.org/icml/2021/kirschner2021icml-biasrobust/}
}