ML Anthology
Poovendran, Radha
18 publications
AAAI
2025
ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates
Fengqing Jiang, Zhangchen Xu, Luyao Niu, Bill Yuchen Lin, Radha Poovendran
ICLRW
2025
CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models
Yuetai Li, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Dinuka Sahabandu, Bhaskar Ramasubramanian, Radha Poovendran
ICLR
2025
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, Radha Poovendran, Yejin Choi, Bill Yuchen Lin
ICLRW
2025
SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities
Fengqing Jiang, Zhangchen Xu, Yuetai Li, Luyao Niu, Zhen Xiang, Bo Li, Bill Yuchen Lin, Radha Poovendran
ICLRW
2025
Stronger Models Are NOT Always Stronger Teachers for Instruction Tuning
Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Radha Poovendran
ICML
2025
ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning
Bill Yuchen Lin, Ronan Le Bras, Kyle Richardson, Ashish Sabharwal, Radha Poovendran, Peter Clark, Yejin Choi
ICLRW
2024
ArtPrompt: ASCII Art-Based Jailbreak Attacks Against Aligned LLMs
Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran
ICLR
2024
BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li
NeurIPSW
2024
ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates
Fengqing Jiang, Zhangchen Xu, Luyao Niu, Bill Yuchen Lin, Radha Poovendran
ICLRW
2024
SafeDecoding: Defending Against Jailbreak Attacks via Safety-Aware Decoding
Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran
NeurIPSW
2023
BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li
NeurIPS
2023
FedGame: A Game-Theoretic Defense Against Backdoor Attacks in Federated Learning
Jinyuan Jia, Zhuowen Yuan, Dinuka Sahabandu, Luyao Niu, Arezoo Rajabi, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran
NeurIPSW
2023
Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications
Fengqing Jiang, Zhangchen Xu, Luyao Niu, Boxin Wang, Jinyuan Jia, Bo Li, Radha Poovendran
IJCAI
2023
Learning Dissemination Strategies for External Sources in Opinion Dynamic Models with Cognitive Biases
Abdullah Al Maruf, Luyao Niu, Bhaskar Ramasubramanian, Andrew Clark, Radha Poovendran
CVPRW
2019
Dropping Pixels for Adversarial Robustness
Hossein Hosseini, Sreeram Kannan, Radha Poovendran
CVPRW
2018
Assessing Shape Bias Property of Convolutional Neural Networks
Hossein Hosseini, Baicen Xiao, Mayoore Jaiswal, Radha Poovendran
CVPRW
2018
Semantic Adversarial Examples
Hossein Hosseini, Radha Poovendran
CVPRW
2017
Deceiving Google's Cloud Video Intelligence API Built for Summarizing Videos
Hossein Hosseini, Baicen Xiao, Radha Poovendran