Niu, Luyao

11 publications

AAAI 2025. ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates. Fengqing Jiang, Zhangchen Xu, Luyao Niu, Bill Yuchen Lin, Radha Poovendran
ICLRW 2025. CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models. Yuetai Li, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Dinuka Sahabandu, Bhaskar Ramasubramanian, Radha Poovendran
ICLR 2025. Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, Radha Poovendran, Yejin Choi, Bill Yuchen Lin
ICLRW 2025. SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities. Fengqing Jiang, Zhangchen Xu, Yuetai Li, Luyao Niu, Zhen Xiang, Bo Li, Bill Yuchen Lin, Radha Poovendran
ICLRW 2025. Stronger Models Are NOT Always Stronger Teachers for Instruction Tuning. Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Radha Poovendran
ICLRW 2024. ArtPrompt: ASCII Art-Based Jailbreak Attacks Against Aligned LLMs. Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran
NeurIPSW 2024. ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates. Fengqing Jiang, Zhangchen Xu, Luyao Niu, Bill Yuchen Lin, Radha Poovendran
ICLRW 2024. SafeDecoding: Defending Against Jailbreak Attacks via Safety-Aware Decoding. Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran
NeurIPS 2023. FedGame: A Game-Theoretic Defense Against Backdoor Attacks in Federated Learning. Jinyuan Jia, Zhuowen Yuan, Dinuka Sahabandu, Luyao Niu, Arezoo Rajabi, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran
NeurIPSW 2023. Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications. Fengqing Jiang, Zhangchen Xu, Luyao Niu, Boxin Wang, Jinyuan Jia, Bo Li, Radha Poovendran
IJCAI 2023. Learning Dissemination Strategies for External Sources in Opinion Dynamic Models with Cognitive Biases. Abdullah Al Maruf, Luyao Niu, Bhaskar Ramasubramanian, Andrew Clark, Radha Poovendran