ML Anthology
He, Xuanli
7 publications
ICLR 2025
An Auditing Test to Detect Behavioral Shift in Language Models
Leo Richter, Xuanli He, Pasquale Minervini, Matt Kusner
ICMLW 2024
An Auditing Test to Detect Behavioral Shift in Language Models
Leo Richter, Nitin Agrawal, Xuanli He, Pasquale Minervini, Matt Kusner
NeurIPSW 2024
Analysing the Residual Stream of Language Models Under Knowledge Conflicts
Yu Zhao, Xiaotang Du, Giwon Hong, Aryo Pradipta Gema, Alessio Devoto, Hongru Wang, Xuanli He, Kam-Fai Wong, Pasquale Minervini
ICLRW 2024
Attacks on Third-Party APIs of Large Language Models
Wanru Zhao, Vidit Khazanchi, Haodi Xing, Xuanli He, Qiongkai Xu, Nicholas Donald Lane
TMLR 2024
Generative Models Are Self-Watermarked: Declaring Model Authentication Through Re-Generation
Aditya Desu, Xuanli He, Qiongkai Xu, Wei Lu
NeurIPS 2022
CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks
Xuanli He, Qiongkai Xu, Yi Zeng, Lingjuan Lyu, Fangzhao Wu, Jiwei Li, Ruoxi Jia
AAAI 2022
Protecting Intellectual Property of Language Generation APIs with Lexical Watermark
Xuanli He, Qiongkai Xu, Lingjuan Lyu, Fangzhao Wu, Chenguang Wang