Editing a Classifier by Rewriting Its Prediction Rules
Abstract
We propose a methodology for modifying the behavior of a classifier by directly rewriting its prediction rules. Our method requires virtually no additional data collection and can be applied to a variety of settings, including adapting a model to new environments and modifying it to ignore spurious features.
Cite
Text
Santurkar et al. "Editing a Classifier by Rewriting Its Prediction Rules." Neural Information Processing Systems, 2021.
Markdown
[Santurkar et al. "Editing a Classifier by Rewriting Its Prediction Rules." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/santurkar2021neurips-editing/)
BibTeX
@inproceedings{santurkar2021neurips-editing,
title = {{Editing a Classifier by Rewriting Its Prediction Rules}},
author = {Santurkar, Shibani and Tsipras, Dimitris and Elango, Mahalaxmi and Bau, David and Torralba, Antonio and Madry, Aleksander},
booktitle = {Neural Information Processing Systems},
year = {2021},
url = {https://mlanthology.org/neurips/2021/santurkar2021neurips-editing/}
}