Data Selection for LLM Alignment Using Fine-Grained Preferences
Abstract
Large language model (LLM) alignment aims to ensure that the behavior of LLMs matches human preferences. While collecting data for multiple fine-grained, aspect-specific preferences is becoming increasingly feasible, existing alignment methods typically work with a single preference and thus struggle with the conflicts inherent in such aggregated datasets. As an early attempt, we propose a data-centric approach that aligns LLMs through the effective use of fine-grained preferences. Specifically, we formulate the problem as direct fine-grained preference optimization and introduce preference divergence (PD), which quantifies inter-aspect preference conflicts. Instead of directly tackling the resulting complicated optimization, we recast it as a data selection problem and propose a simple yet effective strategy that identifies the subset of data with the most negative PD values for efficient training. We theoretically analyze the loss-bound optimality of our selection strategy and conduct extensive empirical studies across varied settings and datasets, demonstrating that our practical selection method achieves consistent improvement over standard full-data alignment while using as little as 30% of the data. Our work supports the view that LLM alignment using fine-grained preferences is highly feasible.
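The selection step described in the abstract (keep the subset of data with the most negative PD values) can be illustrated with a minimal sketch. It assumes a PD score has already been computed for each training sample; the abstract does not give the PD formula, so the scores below are made-up numbers, and the function name `select_most_negative_pd` and the `fraction` parameter are hypothetical, not taken from the paper.

```python
def select_most_negative_pd(pd_values, fraction=0.3):
    """Return indices of the `fraction` of samples with the most negative PD.

    `pd_values` is assumed to hold one precomputed PD score per sample;
    how PD is computed is not specified here.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    n = len(pd_values)
    if n == 0:
        return []
    # Keep at least one sample, otherwise round fraction * n to the nearest integer.
    k = max(1, int(round(fraction * n)))
    # Ascending sort puts the most negative PD first; sorted() is stable,
    # so ties keep their original order.
    order = sorted(range(n), key=lambda i: pd_values[i])
    return order[:k]


# Illustrative scores only (not results from the paper).
example_pd = [0.5, -2.0, 0.1, -0.3, 1.2, -1.1, 0.0, 0.9, -0.7, 0.4]
selected = select_most_negative_pd(example_pd, fraction=0.3)
```

With the default `fraction=0.3`, the sketch keeps 30% of the samples, mirroring the budget mentioned in the abstract; the returned indices would then be used to subset the alignment dataset before training.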
Cite
Text
Zhang et al. "Data Selection for LLM Alignment Using Fine-Grained Preferences." International Conference on Learning Representations, 2026.
Markdown
[Zhang et al. "Data Selection for LLM Alignment Using Fine-Grained Preferences." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zhang2026iclr-data/)
BibTeX
@inproceedings{zhang2026iclr-data,
title = {{Data Selection for LLM Alignment Using Fine-Grained Preferences}},
author = {Zhang, Jia and Liu, Yao and Zhang, Chen-Xi and Liu, Yi and Jin, Yi-Xuan and Guo, Lan-Zhe and Li, Yu-Feng},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/zhang2026iclr-data/}
}