Editing a Classifier by Rewriting Its Prediction Rules

Abstract

We propose a methodology for modifying the behavior of a classifier by directly rewriting its prediction rules. Our method requires virtually no additional data collection and can be applied to a variety of settings, including adapting a model to new environments and modifying it to ignore spurious features.

Cite

Text

Santurkar et al. "Editing a Classifier by Rewriting Its Prediction Rules." Neural Information Processing Systems, 2021.

Markdown

[Santurkar et al. "Editing a Classifier by Rewriting Its Prediction Rules." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/santurkar2021neurips-editing/)

BibTeX

@inproceedings{santurkar2021neurips-editing,
  title     = {{Editing a Classifier by Rewriting Its Prediction Rules}},
  author    = {Santurkar, Shibani and Tsipras, Dimitris and Elango, Mahalaxmi and Bau, David and Torralba, Antonio and Madry, Aleksander},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/santurkar2021neurips-editing/}
}