Information-Theoretic Quantification of Inherent Discrimination Bias in Training Data for Supervised Learning

Abstract

Algorithmic fairness research has mainly focused on adapting learning models to mitigate discrimination based on protected attributes, while the biases inherent in the training data itself remain largely unexplored. Quantifying these biases is crucial for informed data engineering, since data mining and model development often occur separately. We address this gap by developing an information-theoretic framework to quantify the marginal impacts of dataset features on the discrimination bias of downstream predictors. We postulate a set of desired properties for candidate discrimination measures and derive measures that (partially) satisfy them. Distinct subsets of these properties align with distinct fairness criteria, such as demographic parity or equalized odds; we show that these subsets can conflict and cannot all be satisfied by a single measure. We use the Shapley value to determine individual features' contributions to overall discrimination, and prove its effectiveness in eliminating redundancy. We validate our measures through a comprehensive empirical study on numerous real-world and synthetic datasets. For synthetic data, we use a parametric linear structural causal model to generate diverse data correlation structures. Our analysis provides empirically validated guidelines for selecting discrimination measures based on data conditions and fairness criteria, establishing a robust framework for quantifying inherent discrimination bias in data.
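To make the Shapley-value attribution concrete, the following is a minimal sketch and not the authors' implementation. It assumes discrete features and uses the plain empirical mutual information I(X_S; A) between a feature subset X_S and the protected attribute A as the characteristic function. That choice is a stand-in for illustration only: the abstract does not specify the paper's derived measures. The function names and the toy dataset are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import factorial, log2


def mutual_information(rows, feature_idx, protected):
    """Empirical I(X_S; A) in bits for discrete data; 0 for the empty subset."""
    if not feature_idx:
        return 0.0
    n = len(rows)
    xs = [tuple(row[i] for i in feature_idx) for row in rows]
    count_x = Counter(xs)
    count_a = Counter(protected)
    count_xa = Counter(zip(xs, protected))
    # sum over observed (x, a) pairs of p(x, a) * log2(p(x, a) / (p(x) p(a)))
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_a[a]))
        for (x, a), c in count_xa.items()
    )


def shapley_values(rows, protected, n_features):
    """Exact Shapley value of each feature under v(S) = I(X_S; A) (assumed)."""
    players = range(n_features)
    values = []
    for i in players:
        others = [j for j in players if j != i]
        phi = 0.0
        for k in range(len(others) + 1):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(k) * factorial(n_features - k - 1) / factorial(n_features)
            for subset in combinations(others, k):
                with_i = tuple(sorted(subset + (i,)))
                phi += weight * (
                    mutual_information(rows, with_i, protected)
                    - mutual_information(rows, subset, protected)
                )
        values.append(phi)
    return values


# Hypothetical toy data: features 0 and 1 both copy the protected attribute,
# feature 2 is independent of it.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
protected = [0, 0, 1, 1]
phi = shapley_values(rows, protected, n_features=3)
```

In this toy case the two redundant copies of the protected attribute split the single bit of shared information equally, the independent feature receives zero, and the attributions sum to the mutual information of the full feature set. Exact enumeration is exponential in the number of features, so this sketch is only practical for small feature sets.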

Cite

Text

Aldarmini and Nafea. "Information-Theoretic Quantification of Inherent Discrimination Bias in Training Data for Supervised Learning." ICLR 2025 Workshops: Data_Problems, 2025.

Markdown

[Aldarmini and Nafea. "Information-Theoretic Quantification of Inherent Discrimination Bias in Training Data for Supervised Learning." ICLR 2025 Workshops: Data_Problems, 2025.](https://mlanthology.org/iclrw/2025/aldarmini2025iclrw-informationtheoretic/)

BibTeX

@inproceedings{aldarmini2025iclrw-informationtheoretic,
  title     = {{Information-Theoretic Quantification of Inherent Discrimination Bias in Training Data for Supervised Learning}},
  author    = {Aldarmini, Sokrat and Nafea, Mohamed S},
  booktitle = {ICLR 2025 Workshops: Data_Problems},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/aldarmini2025iclrw-informationtheoretic/}
}