Zhang, Pan

34 publications

ICCV 2025 Bootstrap3D: Improving Multi-View Diffusion Model with Synthetic Data Zeyi Sun, Tong Wu, Pan Zhang, Yuhang Zang, Xiaoyi Dong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang
CVPR 2025 ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-Free Way Jiazi Bu, Pengyang Ling, Pan Zhang, Tong Wu, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Dahua Lin, Jiaqi Wang
CVPR 2025 Conical Visual Concentration for Efficient Large Vision-Language Models Long Xing, Qidong Huang, Xiaoyi Dong, Jiajie Lu, Pan Zhang, Yuhang Zang, Yuhang Cao, Conghui He, Jiaqi Wang, Feng Wu, Dahua Lin
ICCV 2025 Deciphering Cross-Modal Alignment in Large Vision-Language Models via Modality Integration Rate Qidong Huang, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Jiaqi Wang, Weiming Zhang, Nenghai Yu
CVPR 2025 Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction Rui Qian, Shuangrui Ding, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Dahua Lin, Jiaqi Wang
NeurIPS 2025 HiFlow: Training-Free High-Resolution Image Generation with Flow-Aligned Guidance Jiazi Bu, Pengyang Ling, Yujie Zhou, Pan Zhang, Tong Wu, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Dahua Lin, Jiaqi Wang
ICCV 2025 Light-a-Video: Training-Free Video Relighting via Progressive Light Fusion Yujie Zhou, Jiazi Bu, Pengyang Ling, Pan Zhang, Tong Wu, Qidong Huang, Jinsong Li, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Anyi Rao, Jiaqi Wang, Li Niu
ICLR 2025 MIA-DPO: Multi-Image Augmented Direct Preference Optimization for Large Vision-Language Models Ziyu Liu, Yuhang Zang, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Haodong Duan, Conghui He, Yuanjun Xiong, Dahua Lin, Jiaqi Wang
ICCV 2025 MM-IFEngine: Towards Multimodal Instruction Following Shengyuan Ding, Shenxi Wu, Xiangyu Zhao, Yuhang Zang, Haodong Duan, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Dahua Lin, Jiaqi Wang
ICLR 2025 MotionClone: Training-Free Motion Cloning for Controllable Video Generation Pengyang Ling, Jiazi Bu, Pan Zhang, Xiaoyi Dong, Yuhang Zang, Tong Wu, Huaian Chen, Jiaqi Wang, Yi Jin
CVPR 2025 OVO-Bench: How Far Is Your Video-LLMs from Real-World Online Video Understanding? Junbo Niu, Yifei Li, Ziyang Miao, Chunjiang Ge, Yuanhang Zhou, Qihao He, Xiaoyi Dong, Haodong Duan, Shuangrui Ding, Rui Qian, Pan Zhang, Yuhang Zang, Yuhang Cao, Conghui He, Jiaqi Wang
ICCV 2025 SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree Shuangrui Ding, Rui Qian, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Yuwei Guo, Dahua Lin, Jiaqi Wang
ICML 2025 SongGen: A Single Stage Auto-Regressive Transformer for Text-to-Song Generation Zihan Liu, Shuangrui Ding, Zhixiong Zhang, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Dahua Lin, Jiaqi Wang
ICML 2025 VideoRoPE: What Makes for Good Video Rotary Position Embedding? Xilin Wei, Xiaoran Liu, Yuhang Zang, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Jian Tong, Haodong Duan, Qipeng Guo, Jiaqi Wang, Xipeng Qiu, Dahua Lin
ICCV 2025 X-Prompt: Generalizable Auto-Regressive Visual Learning with In-Context Prompting Zeyi Sun, Ziyang Chu, Pan Zhang, Tong Wu, Yuhang Zang, Xiaoyi Dong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang
CVPR 2024 Alpha-CLIP: A CLIP Model Focusing on Wherever You Want Zeyi Sun, Ye Fang, Tong Wu, Pan Zhang, Yuhang Zang, Shu Kong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang
NeurIPS 2024 Are We on the Right Way for Evaluating Large Vision-Language Models? Lin Chen, Jinsong Li, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Zehui Chen, Haodong Duan, Jiaqi Wang, Yu Qiao, Dahua Lin, Feng Zhao
CVPR 2024 FreeDrag: Feature Dragging for Reliable Point-Based Image Editing Pengyang Ling, Lin Chen, Pan Zhang, Huaian Chen, Yi Jin, Jinjin Zheng
NeurIPS 2024 InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4k HD Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Bin Wang, Linke Ouyang, Songyang Zhang, Haodong Duan, Wenwei Zhang, Yining Li, Hang Yan, Yang Gao, Zhe Chen, Xinyue Zhang, Wei Li, Jingwen Li, Wenhai Wang, Kai Chen, Conghui He, Xingcheng Zhang, Jifeng Dai, Yu Qiao, Dahua Lin, Jiaqi Wang
ECCV 2024 Long-CLIP: Unlocking the Long-Text Capability of CLIP Beichen Zhang, Pan Zhang, Xiaoyi Dong, Yuhang Zang, Jiaqi Wang
NeurIPS 2024 MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs Ziyu Liu, Tao Chu, Yuhang Zang, Xilin Wei, Xiaoyi Dong, Pan Zhang, Zijian Liang, Yuanjun Xiong, Yu Qiao, Dahua Lin, Jiaqi Wang
NeurIPS 2024 MMLONGBENCH-DOC: Benchmarking Long-Context Document Understanding with Visualizations Yubo Ma, Yuhang Zang, Liangyu Chen, Meiqi Chen, Yizhu Jiao, Xinze Li, Xinyuan Lu, Ziyu Liu, Yan Ma, Xiaoyi Dong, Pan Zhang, Liangming Pan, Yu-Gang Jiang, Jiaqi Wang, Yixin Cao, Aixin Sun
CVPR 2024 OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation Qidong Huang, Xiaoyi Dong, Pan Zhang, Bin Wang, Conghui He, Jiaqi Wang, Dahua Lin, Weiming Zhang, Nenghai Yu
ECCV 2024 ShareGPT4V: Improving Large Multi-Modal Models with Better Captions Lin Chen, Jinsong Li, Xiaoyi Dong, Pan Zhang, Conghui He, Jiaqi Wang, Feng Zhao, Dahua Lin
NeurIPS 2024 ShareGPT4Video: Improving Video Understanding and Generation with Better Captions Lin Chen, Xilin Wei, Jinsong Li, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Zehui Chen, Haodong Duan, Bin Lin, Zhenyu Tang, Li Yuan, Yu Qiao, Dahua Lin, Feng Zhao, Jiaqi Wang
NeurIPS 2024 Streaming Long Video Understanding with Large Language Models Rui Qian, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Shuangrui Ding, Dahua Lin, Jiaqi Wang
AAAI 2024 VIGC: Visual Instruction Generation and Correction Bin Wang, Fan Wu, Xiao Han, Jiahui Peng, Huaping Zhong, Pan Zhang, Xiaoyi Dong, Weijia Li, Wei Li, Jiaqi Wang, Conghui He
CVPR 2023 BUOL: A Bottom-up Framework with Occupancy-Aware Lifting for Panoptic 3D Scene Reconstruction from a Single Image Tao Chu, Pan Zhang, Qiong Liu, Jiaqi Wang
CVPR 2023 MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation Bowen Zhang, Chenyang Qi, Pan Zhang, Bo Zhang, HsiangTao Wu, Dong Chen, Qifeng Chen, Yong Wang, Fang Wen
ICCV 2023 V3Det: Vast Vocabulary Visual Detection Dataset Jiaqi Wang, Pan Zhang, Tao Chu, Yuhang Cao, Yujie Zhou, Tong Wu, Bin Wang, Conghui He, Dahua Lin
ECCV 2022 Real-Time Neural Character Rendering with Pose-Guided Multiplane Images Hao Ouyang, Bo Zhang, Pan Zhang, Hao Yang, Jiaolong Yang, Dong Chen, Qifeng Chen, Fang Wen
CVPR 2021 CoCosNet V2: Full-Resolution Correspondence Learning for Image Translation Xingran Zhou, Bo Zhang, Ting Zhang, Pan Zhang, Jianmin Bao, Dong Chen, Zhongfei Zhang, Fang Wen
CVPR 2021 Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation Pan Zhang, Bo Zhang, Ting Zhang, Dong Chen, Yong Wang, Fang Wen
NeurIPS 2016 Robust Spectral Detection of Global Structures in the Data by Learning a Regularization Pan Zhang