ML Anthology
Liu, Yihong
1 publication
NeurIPS 2025
Refusal Direction Is Universal Across Safety-Aligned Languages
Xinpeng Wang, Mingyang Wang, Yihong Liu, Hinrich Schuetze, Barbara Plank