Shekhar, Shivanshu

1 publication

TMLR 2025. SEE-DPO: Self Entropy Enhanced Direct Preference Optimization. Shivanshu Shekhar, Shreyas Singh, Tong Zhang.