Panickssery, Arjun

3 publications

ICLRW 2025: A Benchmark for Scalable Oversight Mechanisms. Abhimanyu Pallavi Sudhir, Jackson Kaunismaa, Arjun Panickssery
NeurIPSW 2024: Analyzing Probabilistic Methods for Evaluating Agent Capabilities. Axel Højmark, Govind Pimpale, Arjun Panickssery, Marius Hobbhahn, Jérémy Scheurer
NeurIPS 2024: LLM Evaluators Recognize and Favor Their Own Generations. Arjun Panickssery, Samuel R. Bowman, Shi Feng