Variational Inference for Crowdsourcing

Abstract

Crowdsourcing has become a popular paradigm for labeling large datasets. However, it has given rise to the computational task of aggregating the crowdsourced labels provided by a collection of unreliable annotators. We approach this problem by transforming it into a standard inference problem in graphical models, and applying approximate variational methods, including belief propagation (BP) and mean field (MF). We show that our BP algorithm generalizes both majority voting and a recent algorithm by Karger et al., while our MF method is closely related to a commonly used EM algorithm. In both cases, we find that the performance of the algorithms critically depends on the choice of a prior distribution on the workers' reliability; by choosing the prior properly, both BP and MF (and EM) perform surprisingly well on both simulated and real-world datasets, competitive with state-of-the-art algorithms based on more complicated modeling assumptions.
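To make the aggregation task concrete, the following is a minimal sketch, not the authors' implementation. It contrasts majority voting with an EM-style update under an assumed "one-coin" model, in which worker j labels each binary item correctly with probability q_j and q_j has a Beta(alpha, beta) prior. The function names, the label encoding (+1, -1, and 0 for "no label"), and the default prior parameters are all assumptions made for illustration.

```python
import numpy as np

def majority_vote(L):
    """Aggregate by majority. L is an (items x workers) array with entries
    in {-1, 0, +1}, where 0 means the worker did not label the item."""
    votes = L.sum(axis=1)
    return np.where(votes >= 0, 1, -1)  # ties are broken toward +1

def em_one_coin(L, alpha=2.0, beta=1.0, n_iter=50):
    """EM sketch for an assumed one-coin model with a Beta(alpha, beta)
    prior on each worker's reliability q_j (illustrative only)."""
    observed = L != 0
    pos = (L == 1).sum(axis=1)
    neg = (L == -1).sum(axis=1)
    # Initialize P(z_i = +1) from smoothed vote shares.
    mu = (pos + 0.5) / (pos + neg + 1.0)
    q = np.full(L.shape[1], 0.5)
    for _ in range(n_iter):
        # M-step: MAP estimate of q_j, i.e. the mode of the Beta posterior
        # given the expected number of labels that agree with the truth.
        agree = np.where(L == 1, mu[:, None], 1.0 - mu[:, None]) * observed
        q = (agree.sum(axis=0) + alpha - 1.0) / (
            observed.sum(axis=0) + alpha + beta - 2.0
        )
        q = np.clip(q, 1e-6, 1.0 - 1e-6)
        # E-step: posterior log-odds of z_i = +1 is a reliability-weighted
        # vote; missing labels (0) contribute nothing.
        log_odds = (L * np.log(q / (1.0 - q))[None, :]).sum(axis=1)
        mu = 1.0 / (1.0 + np.exp(-np.clip(log_odds, -30.0, 30.0)))
    return np.where(mu >= 0.5, 1, -1), q
```

In this sketch the prior enters only through the M-step, which gives a rough sense of how the choice of alpha and beta can change the estimated reliabilities and therefore the weights each worker receives in the E-step.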

Cite

Text

Liu et al. "Variational Inference for Crowdsourcing." Neural Information Processing Systems, 2012.

Markdown

[Liu et al. "Variational Inference for Crowdsourcing." Neural Information Processing Systems, 2012.](https://mlanthology.org/neurips/2012/liu2012neurips-variational/)

BibTeX

@inproceedings{liu2012neurips-variational,
  title     = {{Variational Inference for Crowdsourcing}},
  author    = {Liu, Qiang and Peng, Jian and Ihler, Alex},
  booktitle = {Neural Information Processing Systems},
  year      = {2012},
  pages     = {692--700},
  url       = {https://mlanthology.org/neurips/2012/liu2012neurips-variational/}
}