Towards a Foundation Model for Crowdsourced Label Aggregation
Abstract
Inferring ground truth from noisy, crowdsourced labels is a fundamental challenge in machine learning. For decades, the dominant paradigm has relied on dataset-specific parameter estimation, a non-scalable method that fails to transfer knowledge. Recent efforts toward universal aggregation models do not account for the structural and behavioral complexities of human-annotated crowdsourcing, resulting in poor real-world performance. To address this gap, we introduce CrowdFM, a foundation model for crowdsourced label aggregation. At its core, CrowdFM is a bipartite graph neural network that is pre-trained on a vast, domain-randomized synthetic dataset to learn diverse behavioral patterns. By leveraging a size-invariant initialization and attention-based message passing, it learns universal principles of collective intelligence and generalizes to new, unseen datasets. Extensive experiments on 22 real-world benchmarks show that our single, fixed model consistently matches or surpasses bespoke, per-dataset methods in both accuracy and efficiency. Furthermore, the representations learned by CrowdFM readily support diverse downstream applications, such as worker assessment and task assignment. Code is available at https://github.com/liiuhaao/CrowdFM.
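To make the problem setting concrete, the following minimal sketch shows the simplest aggregation baseline: majority voting over (worker, task, label) triples, which can be read as the edges of a worker-task bipartite graph. It is not CrowdFM's method. The function name `majority_vote` and the toy annotations are assumptions introduced purely for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(annotations):
    """Aggregate crowdsourced labels by per-task majority vote.

    `annotations` is an iterable of (worker_id, task_id, label) triples,
    i.e. the labeled edges of a worker-task bipartite graph.
    This is a hypothetical baseline sketch, not the CrowdFM model.
    """
    votes = defaultdict(Counter)
    for worker_id, task_id, label in annotations:
        votes[task_id][label] += 1
    # Pick the most frequent label per task; ties go to the smallest label
    # so the result is deterministic.
    return {
        task_id: min(counts, key=lambda lab: (-counts[lab], lab))
        for task_id, counts in votes.items()
    }


# Toy data (illustrative only): three workers label two binary tasks.
annotations = [
    ("w1", "t1", 1), ("w2", "t1", 1), ("w3", "t1", 0),
    ("w1", "t2", 0), ("w2", "t2", 1), ("w3", "t2", 0),
]
aggregated = majority_vote(annotations)
```

Majority voting weights every worker equally, which is why it degrades when worker reliability varies; methods such as the one described above aim to account for that variation.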
Cite
Text
Liu et al. "Towards a Foundation Model for Crowdsourced Label Aggregation." International Conference on Learning Representations, 2026.
Markdown
[Liu et al. "Towards a Foundation Model for Crowdsourced Label Aggregation." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/liu2026iclr-foundation/)
BibTeX
@inproceedings{liu2026iclr-foundation,
  title = {{Towards a Foundation Model for Crowdsourced Label Aggregation}},
  author = {Liu, Hao and Liu, Jiacheng and Tang, Feilong and Chen, Long and Yu, Jiadi and Zhu, Yanmin and Dong, Qiwen and Yu, Yichuan and Hou, Xiaofeng},
  booktitle = {International Conference on Learning Representations},
  year = {2026},
  url = {https://mlanthology.org/iclr/2026/liu2026iclr-foundation/}
}