Zhang, Zhuo
7 publications
AAAI
2025
Correcting Large Language Model Behavior via Influence Function
Han Zhang, Zhuo Zhang, Yi Zhang, Yuanzhao Zhai, Hanyang Peng, Yu Lei, Yue Yu, Hui Wang, Bin Liang, Lin Gui, Ruifeng Xu
NeurIPS
2024
BiScope: AI-Generated Text Detection by Checking Memorization of Preceding Tokens
Hanxi Guo, Siyuan Cheng, Xiaolong Jin, Zhuo Zhang, Kaiyuan Zhang, Guanhong Tao, Guangyu Shen, Xiangyu Zhang
NeurIPS
2024
Detecting Bugs with Substantial Monetary Consequences by LLM and Rule-Based Reasoning
Brian Zhang, Zhuo Zhang
NeurIPSW
2024
MultiVerse: Exposing Large Language Model Alignment Problems in Diverse Worlds
Xiaolong Jin, Zhuo Zhang, Guangyu Shen, Hanxi Guo, Kaiyuan Zhang, Siyuan Cheng, Xiangyu Zhang
NeurIPSW
2024
SkewAct: Red Teaming Large Language Models via Activation-Skewed Adversarial Prompt Optimization
Hanxi Guo, Siyuan Cheng, Guanhong Tao, Guangyu Shen, Zhuo Zhang, Shengwei An, Kaiyuan Zhang, Xiangyu Zhang
NeurIPS
2023
ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP
Lu Yan, Zhuo Zhang, Guanhong Tao, Kaiyuan Zhang, Xuan Chen, Guangyu Shen, Xiangyu Zhang
ICML
2022
Constrained Optimization with Dynamic Bound-Scaling for Effective NLP Backdoor Defense
Guangyu Shen, Yingqi Liu, Guanhong Tao, Qiuling Xu, Zhuo Zhang, Shengwei An, Shiqing Ma, Xiangyu Zhang