Multimodal Neurons in Artificial Neural Networks

Abstract

Distill articles are interactive publications and do not include traditional abstracts. This summary was written for the ML Anthology. The article identifies neurons in CLIP that respond to the same concept across different modalities: for example, a single neuron activates for both Spider-Man imagery and the text "spider". These findings reveal how vision-language models encode semantically linked information.

Cite

Text

Goh et al. "Multimodal Neurons in Artificial Neural Networks." Distill, 2021. doi:10.23915/distill.00030

Markdown

[Goh et al. "Multimodal Neurons in Artificial Neural Networks." Distill, 2021.](https://mlanthology.org/distill/2021/goh2021distill-multimodal/) doi:10.23915/distill.00030

BibTeX

@article{goh2021distill-multimodal,
  title     = {{Multimodal Neurons in Artificial Neural Networks}},
  author    = {Goh, Gabriel and Cammarata, Nick and Voss, Chelsea and Carter, Shan and Petrov, Michael and Schubert, Ludwig and Radford, Alec and Olah, Chris},
  journal   = {Distill},
  year      = {2021},
  doi       = {10.23915/distill.00030},
  url       = {https://mlanthology.org/distill/2021/goh2021distill-multimodal/}
}