ML Anthology
Xiao, Yuxin
3 publications
NeurIPS 2025
KScope: A Framework for Characterizing the Knowledge Status of Language Models
Yuxin Xiao, Shan Chen, Jack Gallifant, Danielle Bitterman, Thomas Hartvigsen, Marzyeh Ghassemi
ICML 2025
Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
Yik Siu Chan, Narutatsu Ri, Yuxin Xiao, Marzyeh Ghassemi
NeurIPS 2024
Enhancing Multiple Dimensions of Trustworthiness in LLMs via Sparse Activation Control
Yuxin Xiao, Chaoqun Wan, Yonggang Zhang, Wenxiao Wang, Binbin Lin, Xiaofei He, Xu Shen, Jieping Ye