Budget-Optimized Crowdworker Allocation

Abstract

Because crowdsourced labels are prone to human error, it is standard practice to collect labels for the same data point from multiple internet workers. We show that the resulting labeling budget can be used more effectively with a flexible assignment strategy that allocates fewer workers to data that are easy to label and more workers to data that require extra scrutiny. Our main contribution is to formulate worker-label aggregation probabilistically and to show how the number of workers allocated to each task can be computed optimally from task difficulty alone, without using worker profiles. Our representative target task is identifying entailment between sentences. To illustrate the proposed methodology, we conducted simulation experiments that use a machine learning system as a proxy for workers and demonstrate its advantages over a state-of-the-art commercial optimizer.

Cite

Text

Lai et al. "Budget-Optimized Crowdworker Allocation." Transactions on Machine Learning Research, 2026.

Markdown

[Lai et al. "Budget-Optimized Crowdworker Allocation." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/lai2026tmlr-budgetoptimized/)

BibTeX

@article{lai2026tmlr-budgetoptimized,
  title     = {{Budget-Optimized Crowdworker Allocation}},
  author    = {Lai, Sha and Ishwar, Prakash and Betke, Margrit},
  journal   = {Transactions on Machine Learning Research},
  year      = {2026},
  url       = {https://mlanthology.org/tmlr/2026/lai2026tmlr-budgetoptimized/}
}