Jiang, Yibo
20 publications
ICML
2025
The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them)
ICLR
2024
Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints
NeurIPS
2024
Do LLMs Dream of Elephants (when Told Not to)? Latent Concept Association and Associative Memory in Transformers
NeurIPSW
2024
Do LLMs Dream of Elephants (when Told Not to)? Latent Concept Association and Associative Memory in Transformers
NeurIPSW
2023
Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints