ML Anthology
Li, Aaron Jiaxun
2 publications
ICLR 2025
More RLHF, More Trust? On the Impact of Preference Alignment on Trustworthiness
Aaron Jiaxun Li, Satyapriya Krishna, Himabindu Lakkaraju
ICML 2024
Improving Prototypical Visual Explanations with Reward Reweighing, Reselection, and Retraining
Aaron Jiaxun Li, Robin Netzorg, Zhihan Cheng, Zhuoqin Zhang, Bin Yu