Xiong, Chenyan

19 publications

NeurIPS 2025 DATE-LM: Benchmarking Data Attribution Evaluation for Large Language Models Cathy Jiao, Yijun Pan, Emily Xiao, Daisy Sheng, Niket Jain, Hanzhang Zhao, Ishita Dasgupta, Jiaqi W. Ma, Chenyan Xiong
NeurIPS 2025 Fairshare Data Pricing via Data Valuation for Large Language Models Luyang Zhang, Cathy Jiao, Beibei Li, Chenyan Xiong
NeurIPS 2025 Group-Level Data Selection for Efficient Pretraining Zichun Yu, Fei Peng, Jie Lei, Arnold Overwijk, Wen-tau Yih, Chenyan Xiong
ICLR 2025 Harnessing Webpage UIs for Text-Rich Visual Understanding Junpeng Liu, Tianyue Ou, Yifan Song, Yuxiao Qu, Wai Lam, Chenyan Xiong, Wenhu Chen, Graham Neubig, Xiang Yue
ICLR 2025 Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning Xiaochuan Li, Zichun Yu, Chenyan Xiong
NeurIPS 2025 ORBIT - Open Recommendation Benchmark for Reproducible Research with Hidden Tests Jingyuan He, Jiongnan Liu, Vishan Vishesh Oberoi, Bolin Wu, Mahima Jagadeesh Patel, Kangrui Mao, Chuning Shi, I-Ta Lee, Arnold Overwijk, Chenyan Xiong
NeurIPS 2025 ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation Pengcheng Huang, Zhenghao Liu, Yukun Yan, Haiyan Zhao, Xiaoyuan Yi, Hao Chen, Zhiyuan Liu, Maosong Sun, Tong Xiao, Ge Yu, Chenyan Xiong
ICLR 2025 RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards Xinze Li, Sen Mei, Zhenghao Liu, Yukun Yan, Shuo Wang, Shi Yu, Zheni Zeng, Hao Chen, Ge Yu, Zhiyuan Liu, Maosong Sun, Chenyan Xiong
ICML 2024 ED-Copilot: Reduce Emergency Department Wait Time with Language Model Diagnostic Assistance Liwen Sun, Abhineet Agarwal, Aaron Kornblith, Bin Yu, Chenyan Xiong
NeurIPSW 2024 Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation Liwen Sun, James Jialun Zhao, Wenjing Han, Chenyan Xiong
NeurIPS 2024 MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models Zichun Yu, Spandan Das, Chenyan Xiong
ICLR 2023 Universal Vision-Language Dense Retrieval: Learning a Unified Representation Space for Multi-Modal Retrieval Zhenghao Liu, Chenyan Xiong, Yuanhuiyi Lv, Zhiyuan Liu, Ge Yu
ICLR 2022 Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul N. Bennett, Jiawei Han, Xia Song
ICLR 2021 Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, Arnold Overwijk
NeurIPS 2021 COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul Bennett, Jiawei Han, Xia Song
AAAI 2021 Data Augmentation for Abstractive Query-Focused Multi-Document Summarization Ramakanth Pasunuru, Asli Celikyilmaz, Michel Galley, Chenyan Xiong, Yizhe Zhang, Mohit Bansal, Jianfeng Gao
AAAI 2020 Latent Relation Language Models Hiroaki Hayashi, Zecong Hu, Chenyan Xiong, Graham Neubig
NeurIPS 2020 Towards Interpretable Natural Language Understanding with Explanations as Latent Variables Wangchunshu Zhou, Jinyi Hu, Hanlin Zhang, Xiaodan Liang, Maosong Sun, Chenyan Xiong, Jian Tang
ICLR 2020 Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention Chen Zhao, Chenyan Xiong, Corby Rosset, Xia Song, Paul Bennett, Saurabh Tiwary