Data Curation for Pluralistic Alignment
Abstract
Human feedback datasets are central to AI alignment, yet current data collection methods do not necessarily capture diverse and complex human values. For example, existing alignment datasets focus broadly on "Harmfulness" and "Helpfulness," but dataset curation should also aim to dissect these broad categories into more specific dimensions. In this paper, we introduce a pluralistic alignment dataset that (i) integrates the dimensions of "Toxicity," "Emotional Awareness," "Sensitivity and Openness," "Helpfulness," and "Stereotypical Bias," (ii) reveals previously undiscovered tensions in human ratings of AI-generated content, (iii) shows how demographics and political ideologies shape human preferences in alignment datasets, and (iv) highlights issues in data collection and model fine-tuning. Through a large-scale human evaluation study (N=1,095 across the U.S. and Germany; five response ratings per participant; 5,475 ratings per dimension; 27,375 total ratings), we identify key challenges in data curation for pluralistic alignment, including the coexistence of conflicting values in human ratings, demographic imbalances, and limitations in reward models and cost functions that prevent them from capturing the diversity of values in the datasets. Based on these findings, we develop a series of considerations to guide researchers and practitioners toward inclusive AI models. By analyzing how human feedback varies across social groups and value dimensions, we shed light on the role of data curation in achieving bidirectional human-AI alignment, where AI systems are shaped by diverse human input and, in turn, surface the complexity and plurality of human values.
Cite
Text
Ali et al. "Data Curation for Pluralistic Alignment." ICLR 2025 Workshops: MLDPR, 2025.
Markdown
[Ali et al. "Data Curation for Pluralistic Alignment." ICLR 2025 Workshops: MLDPR, 2025.](https://mlanthology.org/iclrw/2025/ali2025iclrw-data/)
BibTeX
@inproceedings{ali2025iclrw-data,
  title = {{Data Curation for Pluralistic Alignment}},
  author = {Ali, Dalia and Kocak, Aysenur and Zhao, Dora and Koenecke, Allison and Papakyriakopoulos, Orestis},
  booktitle = {ICLR 2025 Workshops: MLDPR},
  year = {2025},
  url = {https://mlanthology.org/iclrw/2025/ali2025iclrw-data/}
}