Obtaining High-Quality Label by Distinguishing Between Easy and Hard Items in Crowdsourcing

Abstract

Crowdsourcing systems make it possible to hire voluntary workers to label large-scale data in exchange for small monetary payments. The taskmaster usually needs high-quality labels, but the labels obtained from the crowd may not meet this requirement. In this paper, we study the problem of obtaining high-quality labels from the crowd and present an approach that learns the difficulty of items in crowdsourcing: we construct a small training set of items with estimated difficulty and then learn a model to predict the difficulty of future items. With the predicted difficulty, we can distinguish between easy and hard items to obtain high-quality labels. For easy items, the labels inferred from the crowd can be of high enough quality to satisfy the requirement; for hard items, on which the crowd cannot provide high-quality labels, it is better to choose a more knowledgeable crowd or employ specialized workers to label them. The experimental results demonstrate that the proposed approach, by learning to distinguish between easy and hard items, can significantly improve label quality.
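The pipeline described in the abstract (estimate difficulty on a small training set, learn a predictor, then route items) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say how difficulty is estimated or which model is learned, so the sketch assumes worker disagreement as the difficulty estimate, a k-nearest-neighbour average as the predictor, and a fixed threshold for routing. All function names, the toy features, and the threshold value are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the majority label and the fraction of workers who chose it."""
    label, votes = Counter(labels).most_common(1)[0]
    return label, votes / len(labels)


def estimated_difficulty(labels):
    """Assumed proxy: 1 - agreement with the majority label (0 = unanimous)."""
    _, agreement = majority_vote(labels)
    return 1.0 - agreement


def predict_difficulty(training_set, features, k=2):
    """Average the estimated difficulty of the k nearest training items."""
    def squared_distance(pair):
        return sum((a - b) ** 2 for a, b in zip(pair[0], features))
    nearest = sorted(training_set, key=squared_distance)[:k]
    return sum(difficulty for _, difficulty in nearest) / len(nearest)


def route(training_set, items, threshold=0.3, k=2):
    """Split items into (easy, hard) id lists by predicted difficulty."""
    easy, hard = [], []
    for item_id, features in items:
        score = predict_difficulty(training_set, features, k)
        (hard if score > threshold else easy).append(item_id)
    return easy, hard


# Toy data (hypothetical): (feature vector, crowd labels) per training item.
crowd_data = [
    ((0.1,), ["cat", "cat", "cat", "cat", "cat"]),
    ((0.2,), ["dog", "dog", "dog", "dog", "dog"]),
    ((0.8,), ["cat", "dog", "cat", "dog", "cat"]),
    ((0.9,), ["dog", "cat", "dog", "dog", "cat"]),
]
training_set = [(x, estimated_difficulty(labels)) for x, labels in crowd_data]

# Easy items stay with the crowd; hard items go to specialized workers.
easy, hard = route(training_set, [("item-a", (0.15,)), ("item-b", (0.85,))])
print("crowd:", easy, "experts:", hard)
```

In this toy setup, items near the unanimously labeled training examples are predicted easy and kept with the crowd, while items near the contested examples are sent to more knowledgeable workers. The threshold sets the trade-off between labeling cost and label quality.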

Cite

Text

Wang et al. "Obtaining High-Quality Label by Distinguishing Between Easy and Hard Items in Crowdsourcing." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/413

Markdown

[Wang et al. "Obtaining High-Quality Label by Distinguishing Between Easy and Hard Items in Crowdsourcing." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/wang2017ijcai-obtaining/) doi:10.24963/IJCAI.2017/413

BibTeX

@inproceedings{wang2017ijcai-obtaining,
  title     = {{Obtaining High-Quality Label by Distinguishing Between Easy and Hard Items in Crowdsourcing}},
  author    = {Wang, Wei and Guo, Xiang-Yu and Li, Shao-Yuan and Jiang, Yuan and Zhou, Zhi-Hua},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2017},
  pages     = {2964-2970},
  doi       = {10.24963/IJCAI.2017/413},
  url       = {https://mlanthology.org/ijcai/2017/wang2017ijcai-obtaining/}
}