Shrivastava, Aditya

1 publication

NeurIPSW 2024. Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models. Neel Jain, Aditya Shrivastava, Chenyang Zhu, Daben Liu, Alfy Samuel, Ashwinee Panda, Anoop Kumar, Micah Goldblum, Tom Goldstein