Combinatorial Pure Exploration for Dueling Bandit
Abstract
In this paper, we study combinatorial pure exploration for dueling bandits (CPE-DB): we have multiple candidates for multiple positions as modeled by a bipartite graph, and in each round we sample a duel of two candidates on one position and observe who wins in the duel, with the goal of finding the best candidate-position matching with high probability after multiple rounds of samples. CPE-DB is an adaptation of the original combinatorial pure exploration for multi-armed bandit (CPE-MAB) problem to the dueling bandit setting. We consider both the Borda winner and the Condorcet winner cases. For Borda winner, we establish a reduction of the problem to the original CPE-MAB setting and design PAC and exact algorithms that achieve both the sample complexity similar to that in the CPE-MAB setting (which is nearly optimal for a subclass of problems) and polynomial running time per round. For Condorcet winner, we first design a fully polynomial time approximation scheme (FPTAS) for the offline problem of finding the Condorcet winner with known winning probabilities, and then use the FPTAS as an oracle to design a novel pure exploration algorithm CAR-Cond with sample complexity analysis. CAR-Cond is the first algorithm with polynomial running time per round for identifying the Condorcet winner in CPE-DB.
Cite
Text
Chen et al. "Combinatorial Pure Exploration for Dueling Bandit." International Conference on Machine Learning, 2020.Markdown
[Chen et al. "Combinatorial Pure Exploration for Dueling Bandit." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/chen2020icml-combinatorial/)BibTeX
@inproceedings{chen2020icml-combinatorial,
title = {{Combinatorial Pure Exploration for Dueling Bandit}},
author = {Chen, Wei and Du, Yihan and Huang, Longbo and Zhao, Haoyu},
booktitle = {International Conference on Machine Learning},
year = {2020},
pages = {1531-1541},
volume = {119},
url = {https://mlanthology.org/icml/2020/chen2020icml-combinatorial/}
}