Xu, Yuancheng
19 publications
AAAI
2025
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?
ICML
2025
PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model
ICMLW
2024
Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models
ICMLW
2024
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?
NeurIPSW
2024
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?
NeurIPS
2023
C-Disentanglement: Discovering Causally-Independent Generative Factors Under an Inductive Bias of Confounder