Panda, Ayush

2 publications

ICLRW 2025. Patterns and Mechanisms of Contrastive Activation Engineering. Yixiong Hao, Ayush Panda, Stepan Shabalin, Sheikh Abdur Raheem Ali.
NeurIPSW 2023. Eliciting Language Model Behaviors Using Reverse Language Models. Jacob Pfau, Alex Infanger, Abhay Sheshadri, Ayush Panda, Julian Michael, Curtis Huebner.