Aggregating Crowdsourced Ordinal Labels via Bayesian Clustering
Abstract
Crowdsourcing allows the collection of labels from a crowd of workers at low cost. In this paper, we focus on ordinal labels, whose underlying order is important. Crowdsourced labels can be noisy as there may be amateur workers, spammers and/or even malicious workers. Moreover, some workers/items may have very few labels, making the estimation of their behavior difficult. To alleviate these problems, we propose a novel Bayesian model that clusters workers and items together using the nonparametric Dirichlet process priors. This allows workers/items in the same cluster to borrow strength from each other. Instead of directly computing the posterior of this complex model, which is infeasible, we propose a new variational inference procedure. Experimental results on a number of real-world data sets show that the proposed algorithm is more accurate than the state-of-the-art, and is more robust to sparser labels.
Cite
Text
Guo and Kwok. "Aggregating Crowdsourced Ordinal Labels via Bayesian Clustering." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016. doi:10.1007/978-3-319-46128-1_27
Markdown
[Guo and Kwok. "Aggregating Crowdsourced Ordinal Labels via Bayesian Clustering." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016.](https://mlanthology.org/ecmlpkdd/2016/guo2016ecmlpkdd-aggregating/) doi:10.1007/978-3-319-46128-1_27
BibTeX
@inproceedings{guo2016ecmlpkdd-aggregating,
title = {{Aggregating Crowdsourced Ordinal Labels via Bayesian Clustering}},
author = {Guo, Xiawei and Kwok, James T.},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2016},
pages = {426-442},
doi = {10.1007/978-3-319-46128-1_27},
url = {https://mlanthology.org/ecmlpkdd/2016/guo2016ecmlpkdd-aggregating/}
}