The Building Blocks of Interpretability

Abstract

Distill articles are interactive publications and do not include traditional abstracts. This summary was written for the ML Anthology. The article presents a framework for treating interpretability techniques as modular building blocks that can be composed into rich interfaces, revealing what neural networks detect and how they build up understanding across layers.

Cite

Text

Olah et al. "The Building Blocks of Interpretability." Distill, 2018. doi:10.23915/distill.00010

Markdown

[Olah et al. "The Building Blocks of Interpretability." Distill, 2018.](https://mlanthology.org/distill/2018/olah2018distill-building/) doi:10.23915/distill.00010

BibTeX

@article{olah2018distill-building,
  title     = {{The Building Blocks of Interpretability}},
  author    = {Olah, Chris and Satyanarayan, Arvind and Johnson, Ian and Carter, Shan and Schubert, Ludwig and Ye, Katherine and Mordvintsev, Alexander},
  journal   = {Distill},
  year      = {2018},
  doi       = {10.23915/distill.00010},
  url       = {https://mlanthology.org/distill/2018/olah2018distill-building/}
}