A Near-Optimal Algorithm for Debiasing Trained Machine Learning Models
Abstract
We present a scalable post-processing algorithm for debiasing trained models, including deep neural networks (DNNs), which we prove to be near-optimal by bounding its excess Bayes risk. We empirically validate its advantages on standard benchmark datasets across both classical algorithms and modern DNN architectures, and demonstrate that it outperforms previous post-processing methods while performing on par with in-processing methods. In addition, we show that the proposed algorithm is particularly effective for models trained at scale, where post-processing is a natural and practical choice.
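To make the post-processing setting concrete: a debiasing rule of this kind is fit on held-out scores from an already-trained model and applied at prediction time, without retraining. The sketch below is my own illustration, not the paper's near-optimal algorithm; it implements a simpler group-dependent thresholding scheme targeting statistical parity, and the names `debias_thresholds`, `predict`, and `target_rate` are hypothetical.

```python
import numpy as np

def debias_thresholds(scores, groups, target_rate):
    """Pick a per-group score threshold so each group's positive-prediction
    rate matches `target_rate` (statistical parity), using held-out data.

    scores: 1-D array of model scores in [0, 1]
    groups: 1-D array of group ids, aligned with `scores`
    target_rate: desired fraction of positive predictions per group
    """
    thresholds = {}
    for g in np.unique(groups):
        s = scores[groups == g]
        # The (1 - target_rate) quantile of a group's scores yields
        # approximately a `target_rate` fraction of positives.
        thresholds[g] = np.quantile(s, 1.0 - target_rate)
    return thresholds

def predict(scores, groups, thresholds):
    """Apply the group-dependent thresholds to new scores."""
    return np.array([s >= thresholds[g] for s, g in zip(scores, groups)])

# Toy usage: a biased scorer that systematically under-scores group 1.
rng = np.random.default_rng(0)
groups = rng.integers(0, 2, size=1000)
scores = np.clip(rng.normal(0.55 - 0.2 * groups, 0.15), 0.0, 1.0)
thresholds = debias_thresholds(scores, groups, target_rate=0.3)
preds = predict(scores, groups, thresholds)
for g in (0, 1):
    print(f"group {g}: positive rate = {preds[groups == g].mean():.2f}")
```

Unlike this plain thresholding heuristic, the paper's contribution is a rule whose excess Bayes risk is provably near-optimal; the sketch only conveys why post-processing scales well, since it touches only model outputs rather than training.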
Cite
Text
Alabdulmohsin and Lucic. "A Near-Optimal Algorithm for Debiasing Trained Machine Learning Models." Neural Information Processing Systems, 2021.
Markdown
[Alabdulmohsin and Lucic. "A Near-Optimal Algorithm for Debiasing Trained Machine Learning Models." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/alabdulmohsin2021neurips-nearoptimal/)
BibTeX
@inproceedings{alabdulmohsin2021neurips-nearoptimal,
title = {{A Near-Optimal Algorithm for Debiasing Trained Machine Learning Models}},
author = {Alabdulmohsin, Ibrahim M and Lucic, Mario},
booktitle = {Neural Information Processing Systems},
year = {2021},
url = {https://mlanthology.org/neurips/2021/alabdulmohsin2021neurips-nearoptimal/}
}