ML Anthology
Abbas, Alexandra
1 publication
ICLRW 2025
Latent Adversarial Training Improves the Representation of Refusal
Alexandra Abbas, Nora Petrova, Hélios Lyons, Natalia Perez-Campanero