Wang, Xinpeng

5 publications

NeurIPS 2025. Refusal Direction Is Universal Across Safety-Aligned Languages. Xinpeng Wang, Mingyang Wang, Yihong Liu, Hinrich Schuetze, Barbara Plank
ICLR 2025. Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation. Xinpeng Wang, Chengzhi Hu, Paul Röttger, Barbara Plank
NeurIPSW 2024. FinerCut: Finer-Grained Interpretable Layer Pruning for Large Language Models. Yang Zhang, Yawei Li, Xinpeng Wang, Qianli Shen, Barbara Plank, Bernd Bischl, Mina Rezaei, Kenji Kawaguchi
IJCAI 2024. On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models. Xinpeng Wang, Shitong Duan, Xiaoyuan Yi, Jing Yao, Shanlin Zhou, Zhihua Wei, Peng Zhang, Dongkuan Xu, Maosong Sun, Xing Xie
IJCAI 2021. Deep Reinforcement Learning for Multi-Contact Motion Planning of Hexapod Robots. Huiqiao Fu, Kaiqiang Tang, Peng Li, Wenqi Zhang, Xinpeng Wang, Guizhou Deng, Tao Wang, Chunlin Chen