Liu, Yihong

1 publication

NeurIPS 2025. Refusal Direction Is Universal Across Safety-Aligned Languages. Xinpeng Wang, Mingyang Wang, Yihong Liu, Hinrich Schuetze, Barbara Plank.