Online Semi-Supervised Learning with Bandit Feedback

Abstract

We formulate a new problem at the intersection of semi-supervised learning and contextual bandits, motivated by several applications including clinical trials and dialog systems. We demonstrate how contextual bandits and graph convolutional networks (GCNs) can be adapted to the new problem formulation. We then combine the strengths of both approaches to develop a multi-GCN embedded contextual bandit. Our algorithms are verified on several real-world datasets.

Cite

Text

Yurochkin et al. "Online Semi-Supervised Learning with Bandit Feedback." ICLR 2019 Workshops: LLD, 2019.

Markdown

[Yurochkin et al. "Online Semi-Supervised Learning with Bandit Feedback." ICLR 2019 Workshops: LLD, 2019.](https://mlanthology.org/iclrw/2019/yurochkin2019iclrw-online/)

BibTeX

@inproceedings{yurochkin2019iclrw-online,
  title     = {{Online Semi-Supervised Learning with Bandit Feedback}},
  author    = {Yurochkin, Mikhail and Upadhyay, Sohini and Bouneffouf, Djallel and Agarwal, Mayank and Khazaeni, Yasaman},
  booktitle = {ICLR 2019 Workshops: LLD},
  year      = {2019},
  url       = {https://mlanthology.org/iclrw/2019/yurochkin2019iclrw-online/}
}