Trujillo, Juan

2 publications

AAAI 2025. Extracting Interpretable Task-Specific Circuits from Large Language Models for Faster Inference. Jorge García-Carrasco, Alejandro Maté, Juan Trujillo
IJCAI 2024. Detecting and Understanding Vulnerabilities in Language Models via Mechanistic Interpretability. Jorge García-Carrasco, Alejandro Maté, Juan Trujillo