Multimodal Neurons in Artificial Neural Networks

Abstract

Distill articles are interactive publications and do not include traditional abstracts. This summary was written for the ML Anthology. The article identifies neurons in CLIP that respond to the same concept across different modalities: for example, a single neuron activates for both Spider-Man imagery and the text "spider". These findings reveal how vision-language models encode semantically linked information.

Cite

Text

Goh et al. "Multimodal Neurons in Artificial Neural Networks." Distill, 2021. doi:10.23915/distill.00030

Markdown

[Goh et al. "Multimodal Neurons in Artificial Neural Networks." Distill, 2021.](https://mlanthology.org/distill/2021/goh2021distill-multimodal/) doi:10.23915/distill.00030

BibTeX

@article{goh2021distill-multimodal,
  title     = {{Multimodal Neurons in Artificial Neural Networks}},
  author    = {Goh, Gabriel and Cammarata, Nick and Voss, Chelsea and Carter, Shan and Petrov, Michael and Schubert, Ludwig and Radford, Alec and Olah, Chris},
  journal   = {Distill},
  year      = {2021},
  doi       = {10.23915/distill.00030},
  url       = {https://mlanthology.org/distill/2021/goh2021distill-multimodal/}
}