Hazarika, Devamanyu
8 publications
NeurIPSW
2024
LLM-PIRATE: A Benchmark for Indirect Prompt Injection Attacks in Large Language Models
NeurIPSW
2023
Data-Efficient Alignment of Large Language Models with Human Feedback Through Natural Language
NeurIPSW
2023
Supervised Fine-Tuning of Large Language Models on Human Demonstrations Through the Lens of Memorization