ML Anthology
Dong, Peiran
5 publications
ICLR 2025
Causally Motivated Sycophancy Mitigation for Large Language Models
Haoxi Li, Xueyang Tang, Jie Zhang, Song Guo, Sikai Bai, Peiran Dong, Yue Yu
ICLR 2025
Durable Quantization Conditioned Misalignment Attack on Large Language Models
Peiran Dong, Haowei Li, Song Guo
ICML 2024
Easing Concept Bleeding in Diffusion via Entity Localization and Anchoring
Jiewei Zhang, Song Guo, Peiran Dong, Jie Zhang, Ziming Liu, Yue Yu, Xiao-Ming Wu
NeurIPS 2024
Towards Safe Concept Transfer of Multi-Modal Diffusion via Causal Representation Editing
Peiran Dong, Bingjie Wang, Song Guo, Junxiao Wang, Jie Zhang, Zicong Hong
NeurIPS 2023
Towards Test-Time Refusals via Concept Negation
Peiran Dong, Song Guo, Junxiao Wang, Bingjie Wang, Jiewei Zhang, Ziming Liu