Sun, Yanchao
33 publications
ICLR
2025
TIS-DPO: Token-Level Importance Sampling for Direct Preference Optimization with Estimated Weights
ICLR
2024
COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL
ICLR
2024
Rethinking Adversarial Policies: A Generalized Attack Formulation and Provable Defense in RL
NeurIPS
2023
TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning
NeurIPSW
2023
COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL
ICMLW
2023
Equal Long-Term Benefit Rate: Adapting Static Fairness Notions to Sequential Decision Making
NeurIPSW
2023
O3D: Offline Data-Driven Discovery and Distillation for Sequential Decision-Making with Large Language Models
NeurIPS
2022
Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach