Nagpal, Chirag

17 publications

ICCV 2025 Erasing More than Intended? How Concept Erasure Degrades the Generation of Non-Target Concepts Ibtihel Amara, Ahmed Imtiaz Humayun, Ivana Kajic, Zarana Parekh, Natalie Harris, Sarah Young, Chirag Nagpal, Najoung Kim, Junfeng He, Cristina Nader Vasconcelos, Deepak Ramachandran, Golnoosh Farnadi, Katherine Heller, Mohammad Havaei, Negar Rostamzadeh

ICML 2025 InfAlign: Inference-Aware Language Model Alignment Ananth Balashankar, Ziteng Sun, Jonathan Berant, Jacob Eisenstein, Michael Collins, Adrian Hutter, Jong Lee, Chirag Nagpal, Flavien Prost, Aradhana Sinha, Ananda Theertha Suresh, Ahmad Beirami

ICLR 2025 Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning Amrith Setlur, Chirag Nagpal, Adam Fisch, Xinyang Geng, Jacob Eisenstein, Rishabh Agarwal, Alekh Agarwal, Jonathan Berant, Aviral Kumar

TMLR 2025 Robust Preference Optimization Through Reward Model Distillation Adam Fisch, Jacob Eisenstein, Vicky Zayats, Alekh Agarwal, Ahmad Beirami, Chirag Nagpal, Peter Shaw, Jonathan Berant

ICML 2025 Theoretical Guarantees on the Best-of-N Alignment Policy Ahmad Beirami, Alekh Agarwal, Jonathan Berant, Alexander D'Amour, Jacob Eisenstein, Chirag Nagpal, Ananda Theertha Suresh

NeurIPS 2025 Understanding Challenges to the Interpretation of Disaggregated Evaluations of Algorithmic Fairness Stephen R Pfohl, Natalie Harris, Chirag Nagpal, David Madras, Vishwali Mhasawade, Olawale Elijah Salaudeen, Awa Dieng, Shannon Sequeira, Santiago Eduardo Arciniegas, Lillian Sung, Nnamdi Peter Okechukwu Ezeanochie, Heather Cole-Lewis, Katherine A Heller, Sanmi Koyejo, Alexander Nicholas D'Amour

ICML 2024 Transforming and Combining Rewards for Aligning Large Language Models Zihao Wang, Chirag Nagpal, Jonathan Berant, Jacob Eisenstein, Alexander Nicholas D'Amour, Sanmi Koyejo, Victor Veitch

NeurIPS 2023 Participatory Personalization in Classification Hailey Joren, Chirag Nagpal, Katherine A. Heller, Berk Ustun

ICMLW 2023 Participatory Personalization in Classification Hailey Joren, Chirag Nagpal, Katherine A Heller, Berk Ustun

NeurIPSW 2023 Reward Model Aggregation Zihao Wang, Chirag Nagpal, Alexander D'Amour, Victor Veitch, Sanmi Koyejo

NeurIPSW 2023 Reward Model Underspecification in Language Model Alignment Jacob Eisenstein, Jonathan Berant, Chirag Nagpal, Alekh Agarwal, Ahmad Beirami, Alexander Nicholas D'Amour, Krishnamurthy Dj Dvijotham, Katherine A Heller, Stephen Robert Pfohl, Deepak Ramachandran

NeurIPSW 2023 Understanding Subgroup Performance Differences of Fair Predictors Using Causal Models Stephen Robert Pfohl, Natalie Harris, Chirag Nagpal, David Madras, Vishwali Mhasawade, Olawale Elijah Salaudeen, Katherine A Heller, Sanmi Koyejo, Alexander Nicholas D'Amour

MLHC 2022 Auton-Survival: An Open-Source Package for Regression, Counterfactual Estimation, Evaluation and Phenotyping with Censored Time-to-Event Data Chirag Nagpal, Willa Potosnak, Artur Dubrawski

NeurIPSW 2022 Participatory Systems for Personalized Prediction Hailey Joren, Chirag Nagpal, Katherine A Heller, Berk Ustun

NeurIPSW 2022 Participatory Systems for Personalized Prediction Hailey Joren, Chirag Nagpal, Katherine A Heller, Berk Ustun

MLHC 2021 Deep Cox Mixtures for Survival Regression Chirag Nagpal, Steve Yadlowsky, Negar Rostamzadeh, Katherine Heller

MLHC 2019 Dynamically Personalized Detection of Hemorrhage Chirag Nagpal, Xinyu Li, Michael R. Pinsky, Artur Dubrawski