ML Anthology
Yan, Liu
2 publications
ICLR 2026
Faithful Bi-Directional Model Steering via Distribution Matching and Distributed Interchange Interventions
Yuntai Bao, Xuhong Zhang, Jintao Chen, Ge Su, Yuxiang Cai, Hao Peng, Sun Bing, Haiqin Weng, Liu Yan, Jianwei Yin
ICLR 2025
A Benchmark for Semantic Sensitive Information in LLMs Outputs
Qingjie Zhang, Han Qiu, Di Wang, Yiming Li, Tianwei Zhang, Wenyu Zhu, Haiqin Weng, Liu Yan, Chao Zhang