Alignment with Human Representations Supports Robust Few-Shot Learning
Abstract
Should we care whether AI systems have representations of the world that are similar to those of humans? We provide an information-theoretic analysis that suggests that there should be a U-shaped relationship between the degree of representational alignment with humans and performance on few-shot learning tasks. We confirm this prediction empirically, finding such a relationship in an analysis of the performance of 491 computer vision models. We also show that highly-aligned models are more robust to both natural adversarial attacks and domain shifts. Our results suggest that human-alignment is often a sufficient, but not necessary, condition for models to make effective use of limited data, be robust, and generalize well.
Cite
Text
Sucholutsky and Griffiths. "Alignment with Human Representations Supports Robust Few-Shot Learning." Neural Information Processing Systems, 2023.
Markdown
[Sucholutsky and Griffiths. "Alignment with Human Representations Supports Robust Few-Shot Learning." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/sucholutsky2023neurips-alignment/)
BibTeX
@inproceedings{sucholutsky2023neurips-alignment,
title = {{Alignment with Human Representations Supports Robust Few-Shot Learning}},
author = {Sucholutsky, Ilia and Griffiths, Tom},
booktitle = {Neural Information Processing Systems},
year = {2023},
url = {https://mlanthology.org/neurips/2023/sucholutsky2023neurips-alignment/}
}