Gallego, Víctor

3 publications

ICLRW 2025 | MetaSC: Test-Time Safety Specification Optimization for Language Models | Victor Gallego
ICMLW 2024 | Merging Improves Self-Critique Against Jailbreak Attacks | Victor Gallego
AAAI 2019 | Reinforcement Learning Under Threats | Víctor Gallego, Roi Naveiro, David Ríos Insua