Yao, Yuanshun

14 publications

ICLR 2025 ACC-Collab: An Actor-Critic Approach to Multi-Agent LLM Collaboration Andrew Estornell, Jean-Francois Ton, Yuanshun Yao, Yang Liu
ICLRW 2025 Learning to Watermark LLM-Generated Text via Reinforcement Learning Xiaojun Xu, Yuanshun Yao, Yang Liu
ICML 2025 Robust Multi-Bit Text Watermark with LLM-Based Paraphrasers Xiaojun Xu, Jinghan Jia, Yuanshun Yao, Yang Liu, Hang Li
ICLRW 2025 Robust Multi-Bit Text Watermark with LLM-Based Paraphrasers Xiaojun Xu, Jinghan Jia, Yuanshun Yao, Yang Liu, Hang Li
ICLR 2024 Fair Classifiers That Abstain Without Harm Tongxin Yin, Jean-Francois Ton, Ruocheng Guo, Yuanshun Yao, Mingyan Liu, Yang Liu
NeurIPS 2024 Fairness Without Harm: An Influence-Guided Active Sampling Approach Jinlong Pang, Jialu Wang, Zhaowei Zhu, Yuanshun Yao, Chen Qian, Yang Liu
NeurIPS 2024 Large Language Model Unlearning Yuanshun Yao, Xiaojun Xu, Yang Liu
AAAI 2023 DPAUC: Differentially Private AUC Computation in Federated Learning Jiankai Sun, Xin Yang, Yuanshun Yao, Junyuan Xie, Di Wu, Chong Wang
NeurIPSW 2023 Large Language Model Unlearning Yuanshun Yao, Xiaojun Xu, Yang Liu
NeurIPSW 2023 Trustworthy LLMs: A Survey and Guideline for Evaluating Large Language Models' Alignment Yang Liu, Yuanshun Yao, Jean-Francois Ton, Xiaoying Zhang, Ruocheng Guo, Hao Cheng, Yegor Klochkov, Muhammad Faaiz Taufiq, Hang Li
ICML 2023 Weak Proxies Are Sufficient and Preferable for Fairness with Missing Sensitive Attributes Zhaowei Zhu, Yuanshun Yao, Jiankai Sun, Hang Li, Yang Liu
UAI 2022 Differentially Private Multi-Party Data Release for Linear Regression Ruihan Wu, Xin Yang, Yuanshun Yao, Jiankai Sun, Tianyi Liu, Kilian Q. Weinberger, Chong Wang
NeurIPSW 2022 Netflix and Forget: Fast Severance from Memorizing Training Data in Recommendations Mimee Xu, Jiankai Sun, Xin Yang, Yuanshun Yao, Chong Wang
CVPR 2021 Backdoor Attacks Against Deep Learning Systems in the Physical World Emily Wenger, Josephine Passananti, Arjun Nitin Bhagoji, Yuanshun Yao, Haitao Zheng, Ben Y. Zhao