ML Anthology
Satyanarayan, Arvind
4 publications
ICLR
2026
Semantic Regexes: Auto-Interpreting LLM Features with a Structured Language
Angie Boggust, Donghao Ren, Yannick Assogba, Dominik Moritz, Arvind Satyanarayan, Fred Hohman
ECCVW
2024
Explanation Alignment: Quantifying the Correctness of Model Reasoning at Scale
Hyemin Bang, Angie W. Boggust, Arvind Satyanarayan
AAAI
2022
Teaching Humans When to Defer to a Classifier via Exemplars
Hussein Mozannar, Arvind Satyanarayan, David A. Sontag
Distill
2018
The Building Blocks of Interpretability
Chris Olah, Arvind Satyanarayan, Ian Johnson, Shan Carter, Ludwig Schubert, Katherine Ye, Alexander Mordvintsev