Xiao, Yisong

1 publication

NeurIPS 2025 Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing Yisong Xiao, Aishan Liu, Siyuan Liang, Zonghao Ying, Xianglong Liu, Dacheng Tao