Qin, Zeyu
13 publications
ICLRW
2025
Leveraging Reasoning with Guidelines to Elicit and Utilize Knowledge for Enhancing Safety Alignment
NeurIPSW
2024
Entropic Distribution Matching for Supervised Fine-Tuning of LLMs: Less Overfitting and Better Diversity
ICMLW
2023
Improving Adversarial Training for Multiple Perturbations Through the Lens of Uniform Stability