Abbas, Alexandra

1 publication

ICLRW 2025. Latent Adversarial Training Improves the Representation of Refusal. Alexandra Abbas, Nora Petrova, Hélios Lyons, Natalia Perez-Campanero.