Mittal, Prateek

57 publications

ICML 2025 Adapting to Evolving Adversaries with Regularized Continual Robust Training Sihui Dai, Christian Cianfarani, Vikash Sehwag, Prateek Mittal, Arjun Bhagoji
ICLR 2025 Capturing the Temporal Dependence of Training Data Influence Jiachen T. Wang, Dawn Song, James Zou, Prateek Mittal, Ruoxi Jia
ICLR 2025 Data Shapley in One Training Run Jiachen T. Wang, Prateek Mittal, Dawn Song, Ruoxi Jia
ICLR 2025 Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy Tong Wu, Shujian Zhang, Kaiqiang Song, Silei Xu, Sanqiang Zhao, Ravi Agrawal, Sathish Reddy Indurthi, Chong Xiang, Prateek Mittal, Wenxuan Zhou
ICLR 2025 On Evaluating the Durability of Safeguards for Open-Weight LLMs Xiangyu Qi, Boyi Wei, Nicholas Carlini, Yangsibo Huang, Tinghao Xie, Luxi He, Matthew Jagielski, Milad Nasr, Prateek Mittal, Peter Henderson
CVPR 2025 PatchDEMUX: A Certifiably Robust Framework for Multi-Label Classifiers Against Adversarial Patches Dennis Jacob, Chong Xiang, Prateek Mittal
ICLR 2025 Privacy Auditing of Large Language Models Ashwinee Panda, Xinyu Tang, Christopher A. Choquette-Choo, Milad Nasr, Prateek Mittal
TMLR 2025 Private Fine-Tuning of Large Language Models with Zeroth-Order Optimization Xinyu Tang, Ashwinee Panda, Milad Nasr, Saeed Mahloujifar, Prateek Mittal
NeurIPS 2025 ReliabilityRAG: Effective and Provably Robust Defense for RAG-Based Web-Search Zeyu Shen, Basileal Yoseph Imana, Tong Wu, Chong Xiang, Prateek Mittal, Aleksandra Korolova
ICLR 2025 SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Tinghao Xie, Xiangyu Qi, Yi Zeng, Yangsibo Huang, Udari Madhushani Sehwag, Kaixuan Huang, Luxi He, Boyi Wei, Dacheng Li, Ying Sheng, Ruoxi Jia, Bo Li, Kai Li, Danqi Chen, Peter Henderson, Prateek Mittal
ICLR 2025 Safety Alignment Should Be Made More than Just a Few Tokens Deep Xiangyu Qi, Ashwinee Panda, Kaifeng Lyu, Xiao Ma, Subhrajit Roy, Ahmad Beirami, Prateek Mittal, Peter Henderson
ICML 2024 A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization Ashwinee Panda, Xinyu Tang, Saeed Mahloujifar, Vikash Sehwag, Prateek Mittal
ICML 2024 Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications Boyi Wei, Kaixuan Huang, Yangsibo Huang, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, Peter Henderson
ICLRW 2024 Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications Boyi Wei, Kaixuan Huang, Yangsibo Huang, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, Peter Henderson
ICLR 2024 BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection Tinghao Xie, Xiangyu Qi, Ping He, Yiming Li, Jiachen T. Wang, Prateek Mittal
ICLR 2024 BrainLM: A Foundation Model for Brain Activity Recordings Josue Ortega Caro, Antonio Henrique de Oliveira Fonseca, Syed A Rizvi, Matteo Rosati, Christopher Averill, James L Cross, Prateek Mittal, Emanuele Zappala, Rahul Madhav Dhodapkar, Chadi Abdallah, David van Dijk
ICMLW 2024 Certifiably Robust RAG Against Retrieval Corruption Chong Xiang, Tong Wu, Zexuan Zhong, David Wagner, Danqi Chen, Prateek Mittal
AISTATS 2024 Efficient Data Shapley for Weighted Nearest Neighbor Algorithms Jiachen T. Wang, Prateek Mittal, Ruoxi Jia
ICLR 2024 Fine-Tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend to! Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, Peter Henderson
NeurIPS 2024 GREATS: Online Selection of High-Quality Data for LLM Training in Every Iteration Jiachen T. Wang, Tong Wu, Dawn Song, Prateek Mittal, Ruoxi Jia
NeurIPSW 2024 Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy Tong Wu, Shujian Zhang, Kaiqiang Song, Silei Xu, Sanqiang Zhao, Ravi Agrawal, Sathish Reddy Indurthi, Chong Xiang, Prateek Mittal, Wenxuan Zhou
ICMLW 2024 Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs Ashwinee Panda, Berivan Isik, Xiangyu Qi, Sanmi Koyejo, Tsachy Weissman, Prateek Mittal
ICMLW 2024 Privacy Auditing of Large Language Models Ashwinee Panda, Xinyu Tang, Milad Nasr, Christopher A. Choquette-Choo, Prateek Mittal
ICLR 2024 Privacy-Preserving In-Context Learning for Large Language Models Tong Wu, Ashwinee Panda, Jiachen T. Wang, Prateek Mittal
ICMLW 2024 Private Fine-Tuning of Large Language Models with Zeroth-Order Optimization Xinyu Tang, Ashwinee Panda, Milad Nasr, Saeed Mahloujifar, Prateek Mittal
ICLR 2024 Teach LLMs to Phish: Stealing Private Information from Language Models Ashwinee Panda, Christopher A. Choquette-Choo, Zhengming Zhang, Yaoqing Yang, Prateek Mittal
AAAI 2024 Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi, Kaixuan Huang, Ashwinee Panda, Peter Henderson, Mengdi Wang, Prateek Mittal
NeurIPS 2023 A Privacy-Friendly Approach to Data Valuation Jiachen Wang, Yuqing Zhu, Yu-Xiang Wang, Ruoxi Jia, Prateek Mittal
NeurIPS 2023 A Randomized Approach to Tight Privacy Accounting Jiachen Wang, Saeed Mahloujifar, Tong Wu, Ruoxi Jia, Prateek Mittal
NeurIPS 2023 Characterizing the Optimal $0-1$ Loss for Multi-Class Classification with a Test-Time Attacker Sihui Dai, Wenxin Ding, Arjun Nitin Bhagoji, Daniel Cullina, Heather Zheng, Ben Zhao, Prateek Mittal
ICMLW 2023 Characterizing the Optimal $0-1$ Loss for Multi-Class Classification with a Test-Time Attacker Sihui Dai, Wenxin Ding, Arjun Nitin Bhagoji, Daniel Cullina, Ben Y. Zhao, Haitao Zheng, Prateek Mittal
ICMLW 2023 Differentially Private Generation of High Fidelity Samples from Diffusion Models Vikash Sehwag, Ashwinee Panda, Ashwini Pokle, Xinyu Tang, Saeed Mahloujifar, Mung Chiang, J Zico Kolter, Prateek Mittal
NeurIPS 2023 Differentially Private Image Classification by Learning Priors from Random Processes Xinyu Tang, Ashwinee Panda, Vikash Sehwag, Prateek Mittal
ICML 2023 Effectively Using Public Data in Privacy Preserving Machine Learning Milad Nasr, Saeed Mahloujifar, Xinyu Tang, Prateek Mittal, Amir Houmansadr
ICML 2023 MultiRobustBench: Benchmarking Robustness Against Multiple Attacks Sihui Dai, Saeed Mahloujifar, Chong Xiang, Vikash Sehwag, Pin-Yu Chen, Prateek Mittal
ICLR 2023 Revisiting the Assumption of Latent Separability for Backdoor Defenses Xiangyu Qi, Tinghao Xie, Yiming Li, Saeed Mahloujifar, Prateek Mittal
ICMLW 2023 Teach GPT to Phish Ashwinee Panda, Zhengming Zhang, Yaoqing Yang, Prateek Mittal
ICML 2023 Uncovering Adversarial Risks of Test-Time Adaptation Tong Wu, Feiran Jia, Xiangyu Qi, Jiachen T. Wang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal
ICMLW 2023 Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi, Kaixuan Huang, Ashwinee Panda, Mengdi Wang, Prateek Mittal
AISTATS 2022 SparseFed: Mitigating Model Poisoning Attacks in Federated Learning with Sparsification Ashwinee Panda, Saeed Mahloujifar, Arjun Nitin Bhagoji, Supriyo Chakraborty, Prateek Mittal
NeurIPS 2022 Formulating Robustness Against Unforeseen Attacks Sihui Dai, Saeed Mahloujifar, Prateek Mittal
NeurIPSW 2022 Lower Bounds on 0-1 Loss for Multi-Class Classification with a Test-Time Attacker Sihui Dai, Wenxin Ding, Arjun Nitin Bhagoji, Daniel Cullina, Prateek Mittal, Ben Y. Zhao
ICML 2022 Neurotoxin: Durable Backdoors in Federated Learning Zhengming Zhang, Ashwinee Panda, Linyue Song, Yaoqing Yang, Michael Mahoney, Prateek Mittal, Kannan Ramchandran, Joseph Gonzalez
NeurIPS 2022 Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning Jiachen T. Wang, Saeed Mahloujifar, Shouda Wang, Ruoxi Jia, Prateek Mittal
ICLR 2022 Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness? Vikash Sehwag, Saeed Mahloujifar, Tinashe Handina, Sihui Dai, Chong Xiang, Mung Chiang, Prateek Mittal
NeurIPS 2022 Understanding Robust Learning Through the Lens of Representation Similarities Christian Cianfarani, Arjun Nitin Bhagoji, Vikash Sehwag, Ben Zhao, Heather Zheng, Prateek Mittal
NeurIPSW 2021 A Novel Self-Distillation Architecture to Defeat Membership Inference Attacks Xinyu Tang, Saeed Mahloujifar, Liwei Song, Virat Shejwalkar, Milad Nasr, Amir Houmansadr, Prateek Mittal
FnTML 2021 Advances and Open Problems in Federated Learning Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista A. Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaïd Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, Justin Hsu, Martin Jaggi, Tara Javidi, Gauri Joshi, Mikhail Khodak, Jakub Konečný, Aleksandra Korolova, Farinaz Koushanfar, Sanmi Koyejo, Tancrède Lepoint, Yang Liu, Prateek Mittal, Mehryar Mohri, Richard Nock, Ayfer Özgür, Rasmus Pagh, Hang Qi, Daniel Ramage, Ramesh Raskar, Mariana Raykova, Dawn Song, Weikang Song, Sebastian U. Stich, Ziteng Sun, Ananda Theertha Suresh, Florian Tramèr, Praneeth Vepakomma, Jianyu Wang, Li Xiong, Zheng Xu, Qiang Yang, Felix X. Yu, Han Yu, Sen Zhao
ICML 2021 Lower Bounds on Cross-Entropy Loss in the Presence of Test-Time Adversaries Arjun Nitin Bhagoji, Daniel Cullina, Vikash Sehwag, Prateek Mittal
ICLR 2021 SSD: A Unified Framework for Self-Supervised Outlier Detection Vikash Sehwag, Mung Chiang, Prateek Mittal
NeurIPS 2020 HYDRA: Pruning Adversarially Robust Neural Networks Vikash Sehwag, Shiqi Wang, Prateek Mittal, Suman Jana
ICML 2019 Analyzing Federated Learning Through an Adversarial Lens Arjun Nitin Bhagoji, Supriyo Chakraborty, Prateek Mittal, Seraphin Calo
NeurIPS 2019 Lower Bounds on Adversarial Robustness from Optimal Transport Arjun Nitin Bhagoji, Daniel Cullina, Prateek Mittal
NeurIPS 2018 PAC-Learning in the Presence of Adversaries Daniel Cullina, Arjun Nitin Bhagoji, Prateek Mittal