Li, Sharon
20 publications
ICLR
2026
LH-DECEPTION: Simulating and Understanding LLM Deceptive Behaviors in Long-Horizon Interactions
TMLR
2026
Unsupervised Domain Adaptation for Binary Classification with an Unobservable Source Subpopulation
NeurIPS
2025
Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment
NeurIPS
2025
Harnessing Feature Resonance Under Arbitrary Target Alignment for Out-of-Distribution Node Detection