Better than Balancing: Debiasing Through Data Attribution

Abstract

Spurious correlations in the training data can cause serious problems for machine learning deployment. However, common debiasing approaches that intervene on the training procedure (e.g., by adjusting the loss) can be especially sensitive to regularization and hyperparameter selection. In this paper, we advocate for a data-based perspective on model debiasing by directly targeting the root causes of the bias within the training data itself. Specifically, we leverage data attribution techniques to isolate specific examples that disproportionately drive reliance on the spurious correlation. We find that removing these training examples can efficiently debias the final classifier. Moreover, our method requires no additional hyperparameters and does not require group annotations for the training data.
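To make the high-level recipe concrete, below is a minimal sketch (not the authors' code) of debiasing by removing training examples with the largest estimated influence on a spurious correlation. It assumes access to a small "bias-conflicting" evaluation set (examples where the spurious feature disagrees with the label) and uses a crude subset-resampling score on a linear model as a cheap stand-in for a proper data attribution estimator; the data, thresholds, and scoring rule are all illustrative assumptions.

```python
# Sketch only: remove training examples whose presence is most associated
# with high loss on a bias-conflicting set, then retrain.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data: feature 0 is the core feature, feature 1 is spurious
# (correlated with the label during training, flipped at evaluation time).
n = 2000
y = rng.integers(0, 2, n)
core = y + 0.5 * rng.standard_normal(n)
spurious = np.where(rng.random(n) < 0.9, y, 1 - y) + 0.3 * rng.standard_normal(n)
X = np.column_stack([core, spurious])

# Bias-conflicting evaluation set: spurious feature disagrees with the label.
y_val = rng.integers(0, 2, 500)
X_val = np.column_stack([y_val + 0.5 * rng.standard_normal(500),
                         (1 - y_val) + 0.3 * rng.standard_normal(500)])

def conflict_loss(train_idx):
    """Log-loss on the bias-conflicting set after training on a subset."""
    clf = LogisticRegression().fit(X[train_idx], y[train_idx])
    p = clf.predict_proba(X_val)[:, 1]
    return -np.mean(y_val * np.log(p + 1e-12) + (1 - y_val) * np.log(1 - p + 1e-12))

# Crude attribution score: average bias-conflicting loss over random training
# subsets that contain each example (a Monte-Carlo stand-in for
# datamodel/influence-style estimates).
scores = np.zeros(n)
counts = np.zeros(n)
for _ in range(50):
    mask = rng.random(n) < 0.5
    loss = conflict_loss(np.where(mask)[0])
    scores[mask] += loss
    counts[mask] += 1
scores /= np.maximum(counts, 1)

# Drop the top-scoring 10% of training examples and retrain.
keep = np.argsort(scores)[: int(0.9 * n)]
print("bias-conflicting loss, full data:    ", conflict_loss(np.arange(n)))
print("bias-conflicting loss, after removal:", conflict_loss(keep))
```

In this toy setup, the highest-scoring examples tend to be those whose spurious feature agrees strongly with the label, so pruning them reduces the classifier's reliance on that feature without any change to the loss or training procedure.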

Cite

Text

Jain et al. "Better than Balancing: Debiasing Through Data Attribution." NeurIPS 2023 Workshops: DistShift, 2023.

Markdown

[Jain et al. "Better than Balancing: Debiasing Through Data Attribution." NeurIPS 2023 Workshops: DistShift, 2023.](https://mlanthology.org/neuripsw/2023/jain2023neuripsw-better/)

BibTeX

@inproceedings{jain2023neuripsw-better,
  title     = {{Better than Balancing: Debiasing Through Data Attribution}},
  author    = {Jain, Saachi and Hamidieh, Kimia and Georgiev, Kristian and Ghassemi, Marzyeh and Madry, Aleksander},
  booktitle = {NeurIPS 2023 Workshops: DistShift},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/jain2023neuripsw-better/}
}