Tang, Chengfu

2 publications

AAAI 2025. DMT-RoleBench: A Dynamic Multi-Turn Dialogue Based Benchmark for Role-Playing Evaluation of Large Language Model and Agent. Dingbo Yuan, Yipeng Chen, Guodong Liu, Chenchen Li, Chengfu Tang, Dongxu Zhang, Zhenkui Wang, Xudong Wang, Song Liu.
ECML-PKDD 2024. A Merge Sort Based Ranking System for the Evaluation of Large Language Models. Chenchen Li, Linfeng Shi, Chunyi Zhou, Zhaoxin Huan, Chengfu Tang, Xiaolu Zhang, Xudong Wang, Jun Zhou, Song Liu.