ML Anthology
Bhaskar, Adithya
4 publications
ICLR
2025
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin
NeurIPS
2024
Finding Transformer Circuits with Edge Pruning
Adithya Bhaskar, Alexander Wettig, Dan Friedman, Danqi Chen
NeurIPSW
2024
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin
NeurIPSW
2024
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin