ML Anthology
Shrivastava, Aditya
1 publication
NeurIPSW 2024
Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models
Neel Jain, Aditya Shrivastava, Chenyang Zhu, Daben Liu, Alfy Samuel, Ashwinee Panda, Anoop Kumar, Micah Goldblum, Tom Goldstein