Li, Lihong
67 publications
ICLR
2026
SFT Doesn’t Always Hurt General Capabilities: Revisiting Domain-Specific Fine-Tuning in LLMs
AISTATS
2021
Off-Policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders
NeurIPS
2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections