Zong, Zhuofan

15 publications

ICLR 2026 DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving Yang Zhou, Hao Shao, Letian Wang, Zhuofan Zong, Hongsheng Li, Steven L. Waslander
ICLR 2026 WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning Zimu Lu, Houxing Ren, Yunqiao Yang, Ke Wang, Zhuofan Zong, Junting Pan, Mingjie Zhan, Hongsheng Li
ICML 2025 EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM Zhuofan Zong, Dongzhi Jiang, Bingqi Ma, Guanglu Song, Hao Shao, Dazhong Shen, Yu Liu, Hongsheng Li
NeurIPS 2025 T2I-R1: Reinforcing Image Generation with Collaborative Semantic-Level and Token-Level CoT Dongzhi Jiang, Ziyu Guo, Renrui Zhang, Zhuofan Zong, Hao Li, Le Zhuo, Shilin Yan, Pheng-Ann Heng, Hongsheng Li
NeurIPS 2025 VividFace: A Robust and High-Fidelity Video Face Swapping Framework Hao Shao, Shulun Wang, Yang Zhou, Guanglu Song, Dailan He, Zhuofan Zong, Shuo Qin, Yu Liu, Hongsheng Li
NeurIPS 2024 CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching Dongzhi Jiang, Guanglu Song, Xiaoshi Wu, Renrui Zhang, Dazhong Shen, Zhuofan Zong, Yu Liu, Hongsheng Li
NeurIPS 2024 Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models Bingqi Ma, Zhuofan Zong, Guanglu Song, Hongsheng Li, Yu Liu
NeurIPS 2024 MoVA: Adapting Mixture of Vision Experts to Multimodal Context Zhuofan Zong, Bingqi Ma, Dazhong Shen, Guanglu Song, Hao Shao, Dongzhi Jiang, Hongsheng Li, Yu Liu
NeurIPS 2024 Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning Hao Shao, Shengju Qian, Han Xiao, Guanglu Song, Zhuofan Zong, Letian Wang, Yu Liu, Hongsheng Li
ICCV 2023 DETRs with Collaborative Hybrid Assignments Training Zhuofan Zong, Guanglu Song, Yu Liu
NeurIPS 2023 RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths Zeyue Xue, Guanglu Song, Qiushan Guo, Boxiao Liu, Zhuofan Zong, Yu Liu, Ping Luo
ICCV 2023 Temporal Enhanced Training of Multi-View 3D Object Detector via Historical Object Prediction Zhuofan Zong, Dongzhi Jiang, Guanglu Song, Zeyue Xue, Jingyong Su, Hongsheng Li, Yu Liu
NeurIPS 2022 Large-Batch Optimization for Dense Visual Predictions: Training Faster R-CNN in 4.2 Minutes Zeyue Xue, Jianming Liang, Guanglu Song, Zhuofan Zong, Liang Chen, Yu Liu, Ping Luo
ECCV 2022 Self-Slimmed Vision Transformer Zhuofan Zong, Kunchang Li, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu
AAAI 2020 Graph Attention Based Proposal 3D ConvNets for Action Detection Jin Li, Xianglong Liu, Zhuofan Zong, Wanru Zhao, Mingyuan Zhang, Jingkuan Song