Li, Aaron Jiaxun

2 publications

ICLR 2025 More RLHF, More Trust? On the Impact of Preference Alignment on Trustworthiness Aaron Jiaxun Li, Satyapriya Krishna, Himabindu Lakkaraju
ICML 2024 Improving Prototypical Visual Explanations with Reward Reweighing, Reselection, and Retraining Aaron Jiaxun Li, Robin Netzorg, Zhihan Cheng, Zhuoqin Zhang, Bin Yu