Exact Exponent in Optimal Rates for Crowdsourcing
Abstract
Crowdsourcing has become a popular tool for labeling large datasets. This paper studies the optimal error rate for aggregating crowdsourced labels provided by a collection of amateur workers. Under the Dawid-Skene probabilistic model, we establish matching upper and lower bounds with an exact exponent $mI(\pi)$, where $m$ is the number of workers and $I(\pi)$ is the average Chernoff information that characterizes the workers’ collective ability. Such an exact characterization of the error exponent allows us to state a precise sample size requirement $m \ge \frac{1}{I(\pi)}\log\frac{1}{\epsilon}$ in order to achieve an $\epsilon$ misclassification error. In addition, our results imply optimality of various forms of EM algorithms given accurate initializers of the model parameters.
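To make the sample size requirement concrete, the following minimal sketch evaluates m \ge \frac{1}{I(\pi)}\log\frac{1}{\epsilon} numerically. It rests on assumptions not stated in the abstract: a symmetric binary ("one-coin") setting in which worker i answers correctly with probability p_i, so that each worker's Chernoff information is that between Bernoulli(p_i) and Bernoulli(1 - p_i), namely -\log(2\sqrt{p_i(1-p_i)}), and I(\pi) is taken as the plain average over workers. The function names are hypothetical, and the result ignores any lower-order terms in the exponent, so it should be read as a rough guide rather than the paper's exact statement.

```python
import math


def chernoff_one_coin(p):
    """Chernoff information between Bernoulli(p) and Bernoulli(1 - p).

    By symmetry the optimal tilting parameter is t = 1/2, which gives
    -log(2 * sqrt(p * (1 - p))). Assumes 0 < p < 1.
    """
    return -math.log(2.0 * math.sqrt(p * (1.0 - p)))


def average_chernoff(abilities):
    """Average per-worker Chernoff information (assumed form of I(pi))."""
    return sum(chernoff_one_coin(p) for p in abilities) / len(abilities)


def workers_needed(avg_info, eps):
    """Smallest integer m with m >= log(1 / eps) / avg_info."""
    if avg_info <= 0.0:
        raise ValueError("average Chernoff information must be positive")
    return math.ceil(math.log(1.0 / eps) / avg_info)


if __name__ == "__main__":
    # Hypothetical crowd: every worker is correct 80% of the time.
    info = average_chernoff([0.8] * 10)
    print(workers_needed(info, 0.01))
```

Under these assumptions, workers who guess at random (p_i = 0.5) contribute zero information, and the required number of workers grows only logarithmically as the target error \epsilon shrinks.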
Cite
Text
Gao et al. "Exact Exponent in Optimal Rates for Crowdsourcing." International Conference on Machine Learning, 2016.
Markdown
[Gao et al. "Exact Exponent in Optimal Rates for Crowdsourcing." International Conference on Machine Learning, 2016.](https://mlanthology.org/icml/2016/gao2016icml-exact/)
BibTeX
@inproceedings{gao2016icml-exact,
title = {{Exact Exponent in Optimal Rates for Crowdsourcing}},
author = {Gao, Chao and Lu, Yu and Zhou, Dengyong},
booktitle = {International Conference on Machine Learning},
year = {2016},
pages = {603-611},
volume = {48},
url = {https://mlanthology.org/icml/2016/gao2016icml-exact/}
}