Tang, Leonard

8 publications

ICLR 2025 | Endless Jailbreaks with Bijection Learning | Brian R.Y. Huang, Maximilian Li, Leonard Tang
ICMLW 2023 | Baselines for Identifying Watermarked Large Language Models | Leonard Tang, Gavin Uberti, Tom Shlomi
ICMLW 2023 | Consistent Explanations in the Face of Model Indeterminacy via Ensembling | Dan Ley, Leonard Tang, Matthew Nazari, Hongjin Lin, Suraj Srinivas, Himabindu Lakkaraju
NeurIPS 2023 | Degraded Polygons Raise Fundamental Questions of Neural Network Perception | Leonard Tang, Dan Ley
ICLRW 2023 | Learning the Wrong Lessons: Inserting Trojans During Knowledge Distillation | Leonard Tang, Tom Shlomi, Alexander Cai
AAAI 2023 | The Naughtyformer: A Transformer Understands and Moderates Adult Humor (Student Abstract) | Leonard Tang, Alexander Cai, Jason Wang
CVPR 2022 | PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures | Dan Hendrycks, Andy Zou, Mantas Mazeika, Leonard Tang, Bo Li, Dawn Song, Jacob Steinhardt
NeurIPSW 2021 | PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures | Dan Hendrycks, Andy Zou, Mantas Mazeika, Leonard Tang, Dawn Song, Jacob Steinhardt