Exploiting Worker Correlation for Label Aggregation in Crowdsourcing
Abstract
Crowdsourcing has emerged as a core component of data science pipelines. From collected noisy worker labels, aggregation models that incorporate worker reliability parameters aim to infer a latent true annotation. In this paper, we argue that existing crowdsourcing approaches do not sufficiently model worker correlations observed in practical settings; we propose in response an enhanced Bayesian classifier combination (EBCC) model, with inference based on a mean-field variational approach. An introduced mixture of intra-class reliabilities—connected to tensor decomposition and item clustering—induces inter-worker correlation. EBCC does not suffer the limitations of existing correlation models: intractable marginalisation of missing labels and poor scaling to large worker cohorts. Extensive empirical comparison on 17 real-world datasets sees EBCC achieving the highest mean accuracy across 10 benchmark crowdsourcing methods.
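The claim that a mixture of intra-class reliabilities induces inter-worker correlation can be illustrated with a small calculation. The sketch below is not the EBCC model or its variational inference. It is a toy with hypothetical numbers and function names: items of one true class are split into latent subtypes, and each worker has a separate probability of labelling correctly for each subtype. Workers are assumed to be independent within a subtype, and the covariance is taken after marginalising the subtype out.

```python
def joint_and_marginals(weights, acc_a, acc_b):
    """Return (P(both correct), P(A correct), P(B correct)) for one true class.

    weights: mixture weights over latent subtypes (assumed to sum to 1).
    acc_a, acc_b: per-subtype probability that worker A / B labels correctly.
    Workers are assumed independent *given* the subtype.
    """
    p_a = sum(w * a for w, a in zip(weights, acc_a))
    p_b = sum(w * b for w, b in zip(weights, acc_b))
    p_both = sum(w * a * b for w, a, b in zip(weights, acc_a, acc_b))
    return p_both, p_a, p_b


def covariance(weights, acc_a, acc_b):
    """Covariance of the two workers' correctness indicators given the class."""
    p_both, p_a, p_b = joint_and_marginals(weights, acc_a, acc_b)
    return p_both - p_a * p_b


# Hypothetical example: two equally likely subtypes, "easy" and "hard" items.
mixed = covariance([0.5, 0.5], [0.9, 0.5], [0.9, 0.5])

# A single component with the same marginal accuracy gives no correlation.
single = covariance([1.0], [0.7], [0.7])

print(f"mixture covariance: {mixed:.4f}")
print(f"single-component covariance: {single:.4f}")
```

In this toy both workers have a marginal accuracy of 0.7 in either setting, yet only the two-subtype version yields a positive covariance, because both workers tend to succeed on the same easy items and fail on the same hard ones.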
Cite
Text
Li et al. "Exploiting Worker Correlation for Label Aggregation in Crowdsourcing." International Conference on Machine Learning, 2019.
Markdown
[Li et al. "Exploiting Worker Correlation for Label Aggregation in Crowdsourcing." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/li2019icml-exploiting/)
BibTeX
@inproceedings{li2019icml-exploiting,
title = {{Exploiting Worker Correlation for Label Aggregation in Crowdsourcing}},
author = {Li, Yuan and Rubinstein, Benjamin and Cohn, Trevor},
booktitle = {International Conference on Machine Learning},
year = {2019},
pages = {3886-3895},
volume = {97},
url = {https://mlanthology.org/icml/2019/li2019icml-exploiting/}
}