ML Anthology
Dao, James
2 publications
ICMLW
2024
An Adversarial Example for Direct Logit Attribution: Memory Management in GELU-4L
Jett Janiak, Can Rager, James Dao, Yeu-Tong Lau
ICMLW
2024
Challenges in Mechanistically Interpreting Model Representations
Satvik Golechha, James Dao