Kang, Mintong

18 publications

ICLR 2025 $r^2$-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning Mintong Kang, Bo Li
ICML 2025 AdvAgent: Controllable Blackbox Red-Teaming on Web Agents Chejian Xu, Mintong Kang, Jiawei Zhang, Zeyi Liao, Lingbo Mo, Mengqi Yuan, Huan Sun, Bo Li
ICLR 2025 AdvWave: Stealthy Adversarial Jailbreak Attack Against Large Audio-Language Models Mintong Kang, Chejian Xu, Bo Li
ICLRW 2025 AdvWave: Stealthy Adversarial Jailbreak Attack Against Large Audio-Language Models Mintong Kang, Chejian Xu, Shuang Yang, Bo Li
NeurIPS 2025 C-SafeGen: Certified Safe LLM Generation with Claim-Based Streaming Guardrails Mintong Kang, Zhaorun Chen, Bo Li
ICLR 2025 EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage Zeyi Liao, Lingbo Mo, Chejian Xu, Mintong Kang, Jiawei Zhang, Chaowei Xiao, Yuan Tian, Bo Li, Huan Sun
ICCV 2025 FG-OrIU: Towards Better Forgetting via Feature-Gradient Orthogonality for Incremental Unlearning Qian Feng, JiaHang Tu, Mintong Kang, Hanbin Zhao, Chao Zhang, Hui Qian
NeurIPS 2025 GuardSet-X: Massive Multi-Domain Safety Policy-Grounded Guardrail Dataset Mintong Kang, Zhaorun Chen, Chejian Xu, Jiawei Zhang, Chengquan Guo, Minzhou Pan, Ivan Revilla, Yu Sun, Bo Li
ICLR 2025 MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models Chejian Xu, Jiawei Zhang, Zhaorun Chen, Chulin Xie, Mintong Kang, Yujin Potter, Zhun Wang, Zhuowen Yuan, Alexander Xiong, Zidi Xiong, Chenhui Zhang, Lingzhi Yuan, Yi Zeng, Peiyang Xu, Chengquan Guo, Andy Zhou, Jeffrey Ziwei Tan, Xuandong Zhao, Francesco Pinto, Zhen Xiang, Yu Gai, Zinan Lin, Dan Hendrycks, Bo Li, Dawn Song
ICML 2025 ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning Zhaorun Chen, Mintong Kang, Bo Li
ICLRW 2025 ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning Zhaorun Chen, Mintong Kang, Shuang Yang, Bo Li
ICML 2024 C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models Mintong Kang, Nezihe Merve Gürel, Ning Yu, Dawn Song, Bo Li
ICLR 2024 COLEP: Certifiably Robust Learning-Reasoning Conformal Prediction via Probabilistic Circuits Mintong Kang, Nezihe Merve Gürel, Linyi Li, Bo Li
ICML 2024 Certifiably Byzantine-Robust Federated Conformal Prediction Mintong Kang, Zhen Lin, Jimeng Sun, Cao Xiao, Bo Li
NeurIPS 2023 DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li
NeurIPS 2023 DiffAttack: Evasion Attacks Against Diffusion-Based Adversarial Purification Mintong Kang, Dawn Song, Bo Li
NeurIPS 2022 Certifying Some Distributional Fairness with Subpopulation Decomposition Mintong Kang, Linyi Li, Maurice Weber, Yang Liu, Ce Zhang, Bo Li
NeurIPS 2022 Fairness in Federated Learning via Core-Stability Bhaskar Ray Chaudhury, Linyi Li, Mintong Kang, Bo Li, Ruta Mehta