AGR: Age Group Fairness Reward for Bias Mitigation in LLMs

Abstract

LLMs can exhibit age biases, resulting in unequal treatment of individuals across age groups. While much research has addressed racial and gender biases, age bias remains underexplored. The scarcity of instruction-tuning and preference datasets for age bias hampers its detection and measurement, and existing fine-tuning methods seldom address age-related fairness. In this paper, we construct age bias preference and instruction-tuning datasets for RLHF. We introduce AGR, an age group fairness reward that reduces differences in LLM response quality across age groups. Extensive experiments demonstrate that this reward significantly improves response accuracy and reduces performance disparities across age groups. Our source code and datasets are available at this anonymous [link](https://anonymous.4open.science/r/FairRLHF-D445/readme.md).
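To make the idea concrete, below is a minimal sketch of one way a group fairness penalty could be folded into an RLHF reward. This is an illustrative assumption, not the paper's actual AGR formulation: the names `base_reward`, `LAMBDA`, the age group labels, and the mean-gap penalty are all hypothetical.

```python
import statistics
from collections import defaultdict

# Hypothetical sketch of a group-fairness-adjusted reward for RLHF.
# All names and the penalty form are assumptions for illustration only.
LAMBDA = 0.5  # weight of the fairness penalty (assumed)

# Running base rewards observed per age group during training.
group_rewards = defaultdict(list)

def age_fairness_reward(base_reward: float, age_group: str) -> float:
    """Combine a base preference reward with a penalty on the gap
    between this age group's mean reward and the overall mean."""
    group_rewards[age_group].append(base_reward)
    group_means = {g: statistics.fmean(r) for g, r in group_rewards.items()}
    overall_mean = statistics.fmean(group_means.values())
    # Penalize deviation of this group's mean quality from the overall mean,
    # discouraging the policy from serving some age groups better than others.
    disparity = abs(group_means[age_group] - overall_mean)
    return base_reward - LAMBDA * disparity

# Example: rewards for responses to prompts from different age groups.
print(age_fairness_reward(0.9, "young"))
print(age_fairness_reward(0.4, "elderly"))
```

Under this toy formulation, a response only keeps its full base reward when its age group's average quality tracks the overall average, so the policy gradient pushes toward closing cross-group gaps rather than maximizing quality for the best-served group alone.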

Cite

Text

Cao et al. "AGR: Age Group Fairness Reward for Bias Mitigation in LLMs." NeurIPS 2024 Workshops: Pluralistic-Alignment, 2024.

Markdown

[Cao et al. "AGR: Age Group Fairness Reward for Bias Mitigation in LLMs." NeurIPS 2024 Workshops: Pluralistic-Alignment, 2024.](https://mlanthology.org/neuripsw/2024/cao2024neuripsw-agr/)

BibTeX

@inproceedings{cao2024neuripsw-agr,
  title     = {{AGR: Age Group Fairness Reward for Bias Mitigation in LLMs}},
  author    = {Cao, Shuirong and Cheng, Ruoxi and Wang, Zhiqiang},
  booktitle = {NeurIPS 2024 Workshops: Pluralistic-Alignment},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/cao2024neuripsw-agr/}
}