ML Anthology
Maté, Alejandro
3 publications
AAAI 2025
Extracting Interpretable Task-Specific Circuits from Large Language Models for Faster Inference
Jorge García-Carrasco, Alejandro Maté, Juan Trujillo
IJCAI 2024
Detecting and Understanding Vulnerabilities in Language Models via Mechanistic Interpretability
Jorge García-Carrasco, Alejandro Maté, Juan Trujillo
AISTATS 2024
How Does GPT-2 Predict Acronyms? Extracting and Understanding a Circuit via Mechanistic Interpretability
Jorge García-Carrasco, Alejandro Maté, Juan Carlos Trujillo