Pandey, Punya Syon

1 publication

ICLR 2026. SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests. Punya Syon Pandey, Lê Hải Sơn, Devansh Bhardwaj, Zhijing Jin.