Learning to Predict from Crowdsourced Data
Abstract
Crowdsourcing services such as Amazon's Mechanical Turk have greatly expedited manual labeling by distributing it across a large pool of human workers. However, spammers are often unavoidable and the crowdsourced labels can be very noisy. In this paper, we explicitly account for four sources of noise in a crowdsourced label: the worker's dedication to the task, his/her expertise, his/her default labeling judgment, and the difficulty of the sample. A novel mixture model of worker annotations is proposed that learns a prediction model mapping samples directly to labels, enabling efficient out-of-sample testing. Experiments on both simulated and real-world crowdsourced data sets show that the proposed method achieves significant improvements over the state-of-the-art.
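The abstract does not spell out the model's equations, so the following is only a loose illustration of the four noise sources it names, not the paper's method. All variable names and the simplified one-coin generative assumptions below are hypothetical: the sketch simulates dedication, expertise, default labels, and sample difficulty, then contrasts majority voting with a simple Dawid-Skene-style EM aggregator.

# Hypothetical sketch: simulate the four noise sources named in the abstract
# and compare majority voting with a one-coin EM aggregator. This is NOT the
# paper's mixture model; it only illustrates the setting.
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_workers = 500, 10
true_labels = rng.integers(0, 2, size=n_samples)       # binary ground truth
difficulty = rng.uniform(0.0, 0.4, size=n_samples)     # per-sample difficulty
dedication = rng.uniform(0.5, 1.0, size=n_workers)     # prob. worker engages with the task
expertise = rng.uniform(0.6, 0.95, size=n_workers)     # prob. of correct label when engaged
default_label = rng.integers(0, 2, size=n_workers)     # label emitted when not engaged

# Generate annotations: an unengaged worker falls back to a default label;
# an engaged worker's accuracy is reduced by sample difficulty.
labels = np.empty((n_samples, n_workers), dtype=int)
for i in range(n_samples):
    for j in range(n_workers):
        if rng.random() > dedication[j]:
            labels[i, j] = default_label[j]
        else:
            p_correct = np.clip(expertise[j] - difficulty[i], 0.5, 1.0)
            correct = rng.random() < p_correct
            labels[i, j] = true_labels[i] if correct else 1 - true_labels[i]

# Baseline: majority vote over workers.
majority = (labels.mean(axis=1) > 0.5).astype(int)

# One-coin EM: alternate between soft true-label posteriors and per-worker
# accuracy estimates (a far simpler mixture than the paper's model).
acc = np.full(n_workers, 0.7)                           # initial worker accuracies
for _ in range(30):
    # E-step: posterior that each sample's true label is 1 (uniform prior)
    ll1 = np.where(labels == 1, acc, 1 - acc).prod(axis=1)
    ll0 = np.where(labels == 0, acc, 1 - acc).prod(axis=1)
    post1 = ll1 / (ll1 + ll0)
    # M-step: expected fraction of agreements with the inferred label, per worker
    agree = post1[:, None] * (labels == 1) + (1 - post1)[:, None] * (labels == 0)
    acc = agree.mean(axis=0).clip(0.05, 0.95)
em_est = (post1 > 0.5).astype(int)

print("majority-vote accuracy:", (majority == true_labels).mean())
print("EM accuracy:           ", (em_est == true_labels).mean())

Note that this toy aggregator only infers consensus labels; the paper's method additionally learns a classifier from sample features for out-of-sample prediction, which this sketch omits.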
Cite (BibTeX)
@inproceedings{bi2014uai-learning,
title = {{Learning to Predict from Crowdsourced Data}},
author = {Bi, Wei and Wang, Liwei and Kwok, James T. and Tu, Zhuowen},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
year = {2014},
pages = {82--91},
url = {https://mlanthology.org/uai/2014/bi2014uai-learning/}
}