Alignment with Human Representations Supports Robust Few-Shot Learning

Abstract

Should we care whether AI systems have representations of the world that are similar to those of humans? We provide an information-theoretic analysis that suggests that there should be a U-shaped relationship between the degree of representational alignment with humans and performance on few-shot learning tasks. We confirm this prediction empirically, finding such a relationship in an analysis of the performance of 491 computer vision models. We also show that highly aligned models are more robust to both natural adversarial attacks and domain shifts. Our results suggest that human alignment is often a sufficient, but not necessary, condition for models to make effective use of limited data, be robust, and generalize well.
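
To make the two quantities in the abstract concrete, below is a minimal Python sketch (not the authors' exact protocol) of how representational alignment and few-shot performance could be measured for a single model. The arrays model_embeddings, human_similarity, and labels are hypothetical placeholders filled with random data; alignment is scored as an RSA-style Spearman correlation between the model's pairwise similarity structure and human similarity judgments, and few-shot performance is probed with a simple nearest-prototype classifier.

import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

n_stimuli, dim = 50, 128
model_embeddings = rng.normal(size=(n_stimuli, dim))         # placeholder model embeddings
human_similarity = rng.uniform(size=(n_stimuli, n_stimuli))  # placeholder human judgments
human_similarity = (human_similarity + human_similarity.T) / 2

# Representational alignment: rank correlation between the model's pairwise
# cosine similarities and the human similarity judgments (upper triangle only).
model_similarity = 1 - squareform(pdist(model_embeddings, metric="cosine"))
iu = np.triu_indices(n_stimuli, k=1)
alignment, _ = spearmanr(model_similarity[iu], human_similarity[iu])

# Few-shot probe: build one prototype per class from k labelled examples and
# classify the remaining items by nearest prototype in embedding space.
def few_shot_accuracy(embeddings, labels, k=1):
    classes = np.unique(labels)
    support_idx, query_idx = [], []
    for c in classes:
        idx = np.flatnonzero(labels == c)
        support_idx.extend(idx[:k])
        query_idx.extend(idx[k:])
    prototypes = np.stack([
        embeddings[[i for i in support_idx if labels[i] == c]].mean(axis=0)
        for c in classes
    ])
    queries = embeddings[query_idx]
    dists = np.linalg.norm(queries[:, None, :] - prototypes[None, :, :], axis=-1)
    preds = classes[dists.argmin(axis=1)]
    return float((preds == labels[query_idx]).mean())

labels = rng.integers(0, 5, size=n_stimuli)  # placeholder class labels
print(f"alignment (Spearman rho): {alignment:.3f}")
print(f"1-shot prototype accuracy: {few_shot_accuracy(model_embeddings, labels, k=1):.3f}")

Running this per model and plotting few-shot accuracy against the alignment score is one way to look for the U-shaped relationship the paper reports, though the paper's actual analysis uses human behavioral data and 491 pretrained vision models rather than random placeholders.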

Cite

Text

Sucholutsky and Griffiths. "Alignment with Human Representations Supports Robust Few-Shot Learning." Neural Information Processing Systems, 2023.

Markdown

[Sucholutsky and Griffiths. "Alignment with Human Representations Supports Robust Few-Shot Learning." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/sucholutsky2023neurips-alignment/)

BibTeX

@inproceedings{sucholutsky2023neurips-alignment,
  title     = {{Alignment with Human Representations Supports Robust Few-Shot Learning}},
  author    = {Sucholutsky, Ilia and Griffiths, Tom},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/sucholutsky2023neurips-alignment/}
}