On the Value of Prior in Online Learning to Rank
Abstract
This paper addresses the cold-start problem in online learning to rank (OLTR). We show, both theoretically and empirically, that informative priors improve the quality of the ranked lists that OLTR algorithms present to users as they learn interactively from user feedback. These priors can take the form of unbiased estimates of the relevance of the ranked items or, more practically, can be obtained from offline-learned models. Our experiments demonstrate that priors improve the short-term regret of tabular OLTR algorithms based on Thompson sampling and BayesUCB.
Cite
Text
Kveton et al. "On the Value of Prior in Online Learning to Rank." Artificial Intelligence and Statistics, 2022.

Markdown

[Kveton et al. "On the Value of Prior in Online Learning to Rank." Artificial Intelligence and Statistics, 2022.](https://mlanthology.org/aistats/2022/kveton2022aistats-value/)

BibTeX
@inproceedings{kveton2022aistats-value,
title = {{On the Value of Prior in Online Learning to Rank}},
author = {Kveton, Branislav and Meshi, Ofer and Zoghi, Masrour and Qin, Zhen},
booktitle = {Artificial Intelligence and Statistics},
year = {2022},
pages = {6880--6892},
volume = {151},
url = {https://mlanthology.org/aistats/2022/kveton2022aistats-value/}
}