Zhang, Zhuo

9 publications

ICLR 2026 GEPO: Group Expectation Policy Optimization for Stable Heterogeneous Reinforcement Learning Han Zhang, Ruibin Zheng, Zexuan Yi, Zhuo Zhang, Hanyang Peng, Hui Wang, Jiayin Qi, Binxing Fang, Ruifeng Xu, Yue Yu
ICLR 2026 SecP-Tuning: Efficient Privacy-Preserving Prompt Tuning for Large Language Models via MPC Jinglong Luo, Zhuo Zhang, Yehong Zhang, Shiyu Liu, Ye Dong, Hui Wang, Yue Yu, Xun Zhou, Zenglin Xu
AAAI 2025 Correcting Large Language Model Behavior via Influence Function Han Zhang, Zhuo Zhang, Yi Zhang, Yuanzhao Zhai, Hanyang Peng, Yu Lei, Yue Yu, Hui Wang, Bin Liang, Lin Gui, Ruifeng Xu
NeurIPS 2024 BiScope: AI-Generated Text Detection by Checking Memorization of Preceding Tokens Hanxi Guo, Siyuan Cheng, Xiaolong Jin, Zhuo Zhang, Kaiyuan Zhang, Guanhong Tao, Guangyu Shen, Xiangyu Zhang
NeurIPS 2024 Detecting Bugs with Substantial Monetary Consequences by LLM and Rule-Based Reasoning Brian Zhang, Zhuo Zhang
NeurIPSW 2024 MultiVerse: Exposing Large Language Model Alignment Problems in Diverse Worlds Xiaolong Jin, Zhuo Zhang, Guangyu Shen, Hanxi Guo, Kaiyuan Zhang, Siyuan Cheng, Xiangyu Zhang
NeurIPSW 2024 SkewAct: Red Teaming Large Language Models via Activation-Skewed Adversarial Prompt Optimization Hanxi Guo, Siyuan Cheng, Guanhong Tao, Guangyu Shen, Zhuo Zhang, Shengwei An, Kaiyuan Zhang, Xiangyu Zhang
NeurIPS 2023 ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP Lu Yan, Zhuo Zhang, Guanhong Tao, Kaiyuan Zhang, Xuan Chen, Guangyu Shen, Xiangyu Zhang
ICML 2022 Constrained Optimization with Dynamic Bound-Scaling for Effective NLP Backdoor Defense Guangyu Shen, Yingqi Liu, Guanhong Tao, Qiuling Xu, Zhuo Zhang, Shengwei An, Shiqing Ma, Xiangyu Zhang