Online Rank Elicitation for Plackett-Luce: A Dueling Bandits Approach

Abstract

We study the problem of online rank elicitation, assuming that rankings of a set of alternatives obey the Plackett-Luce distribution. Following the setting of the dueling bandits problem, the learner is allowed to query pairwise comparisons between alternatives, i.e., to sample pairwise marginals of the distribution in an online fashion. Using this information, the learner seeks to reliably predict the most probable ranking (or top-alternative). Our approach is based on constructing a surrogate probability distribution over rankings based on a sorting procedure, for which the pairwise marginals provably coincide with the marginals of the Plackett-Luce distribution. In addition to a formal performance and complexity analysis, we present first experimental studies.

Cite

Text

Szörényi et al. "Online Rank Elicitation for Plackett-Luce: A Dueling Bandits Approach." Neural Information Processing Systems, 2015.

Markdown

[Szörényi et al. "Online Rank Elicitation for Plackett-Luce: A Dueling Bandits Approach." Neural Information Processing Systems, 2015.](https://mlanthology.org/neurips/2015/szorenyi2015neurips-online/)

BibTeX

@inproceedings{szorenyi2015neurips-online,
  title     = {{Online Rank Elicitation for Plackett-Luce: A Dueling Bandits Approach}},
  author    = {Szörényi, Balázs and Busa-Fekete, Róbert and Paul, Adil and Hüllermeier, Eyke},
  booktitle = {Neural Information Processing Systems},
  year      = {2015},
  pages     = {604-612},
  url       = {https://mlanthology.org/neurips/2015/szorenyi2015neurips-online/}
}