Dong, Peiran

5 publications

ICLR 2025. Causally Motivated Sycophancy Mitigation for Large Language Models. Haoxi Li, Xueyang Tang, Jie Zhang, Song Guo, Sikai Bai, Peiran Dong, Yue Yu
ICLR 2025. Durable Quantization Conditioned Misalignment Attack on Large Language Models. Peiran Dong, Haowei Li, Song Guo
ICML 2024. Easing Concept Bleeding in Diffusion via Entity Localization and Anchoring. Jiewei Zhang, Song Guo, Peiran Dong, Jie Zhang, Ziming Liu, Yue Yu, Xiao-Ming Wu
NeurIPS 2024. Towards Safe Concept Transfer of Multi-Modal Diffusion via Causal Representation Editing. Peiran Dong, Bingjie Wang, Song Guo, Junxiao Wang, Jie Zhang, Zicong Hong
NeurIPS 2023. Towards Test-Time Refusals via Concept Negation. Peiran Dong, Song Guo, Junxiao Wang, Bingjie Wang, Jiewei Zhang, Ziming Liu