Hu, Sihao
9 publications
- Booster: Tackling Harmful Fine-Tuning for Large Language Models via Attenuating Harmful Perturbation (ICLR 2025)
- Lisa: Lazy Safety Alignment for Large Language Models Against Harmful Fine-Tuning Attack (NeurIPS 2024)