Qi, Zhongang

29 publications

AAAI 2025 CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities Tao Wu, Yong Zhang, Xintao Wang, Xianpan Zhou, Guangcong Zheng, Zhongang Qi, Ying Shan, Xi Li
ICCV 2025 DOGR: Towards Versatile Visual Document Grounding and Referring Yinan Zhou, Yuxin Chen, Haokun Lin, Yichen Wu, Shuyu Yang, Zhongang Qi, Chen Ma, Li Zhu
ICCV 2025 Less Is More: Empowering GUI Agent with Context-Aware Simplification Gongwei Chen, Xurui Zhou, Rui Shao, Yibo Lyu, Kaiwen Zhou, Shuai Wang, Wentao Li, Yinchuan Li, Zhongang Qi, Liqiang Nie
ICCV 2025 Mamba-3VL: Taming State Space Model for 3D Vision Language Learning Yuan Wang, Yuxin Chen, Zhongang Qi, Lijun Liu, Jile Jiao, Xuetao Feng, Yujia Liang, Ying Shan, Zhipeng Zhang
CVPR 2025 Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion Songsong Yu, Yuxin Chen, Zhongang Qi, Zeke Xie, Yifan Wang, Lijun Wang, Ying Shan, Huchuan Lu
ICML 2025 Taming Rectified Flow for Inversion and Editing Jiangshan Wang, Junfu Pu, Zhongang Qi, Jiayi Guo, Yue Ma, Nisha Huang, Yuxin Chen, Xiu Li, Ying Shan
NeurIPS 2025 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning Ye Liu, Zongyang Ma, Junfu Pu, Zhongang Qi, Yang Wu, Ying Shan, Chang Wen Chen
ICCV 2025 VisionMath: Vision-Form Mathematical Problem-Solving Zongyang Ma, Yuxin Chen, Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Shaojie Zhu, Chengxiang Zhuo, Bing Li, Ye Liu, Zang Li, Ying Shan, Weiming Hu
NeurIPS 2024 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding Ye Liu, Zongyang Ma, Zhongang Qi, Yang Wu, Ying Shan, Chang Wen Chen
ECCV 2024 EA-VTR: Event-Aware Video-Text Retrieval Zongyang Ma, Ziqi Zhang, Yuxin Chen, Zhongang Qi, Chunfeng Yuan, Bing Li, Yingmin Luo, Xu Li, Xiaojuan Qi, Ying Shan, Weiming Hu
CVPR 2024 How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval? Yuxin Chen, Zongyang Ma, Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Bing Li, Junfu Pu, Ying Shan, Xiaojuan Qi, Weiming Hu
CVPR 2024 PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding Zhen Li, Mingdeng Cao, Xintao Wang, Zhongang Qi, Ming-Ming Cheng, Ying Shan
AAAI 2024 SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model Tao Wu, Xuewei Li, Zhongang Qi, Di Hu, Xintao Wang, Ying Shan, Xi Li
AAAI 2024 T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion Models Chong Mou, Xintao Wang, Liangbin Xie, Yanze Wu, Jian Zhang, Zhongang Qi, Ying Shan
AAAI 2023 Accelerating the Training of Video Super-Resolution Models Lijian Lin, Xintao Wang, Zhongang Qi, Ying Shan
NeurIPS 2023 Exploiting Contextual Objects and Relations for 3D Visual Grounding Li Yang, Chunfeng Yuan, Ziqi Zhang, Zhongang Qi, Yan Xu, Wei Liu, Ying Shan, Bing Li, Weiping Yang, Peng Li, Yan Wang, Weiming Hu
CVPR 2023 LayoutDiffusion: Controllable Diffusion Model for Layout-to-Image Generation Guangcong Zheng, Xianpan Zhou, Xuewei Li, Zhongang Qi, Ying Shan, Xi Li
ICCV 2023 MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing Mingdeng Cao, Xintao Wang, Zhongang Qi, Ying Shan, Xiaohu Qie, Yinqiang Zheng
ICCV 2023 Order-Prompted Tag Sequence Generation for Video Tagging Zongyang Ma, Ziqi Zhang, Yuxin Chen, Zhongang Qi, Yingmin Luo, Zekun Li, Chunfeng Yuan, Bing Li, Xiaohu Qie, Ying Shan, Weiming Hu
IJCAI 2023 SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation Xuewei Li, Tao Wu, Zhongang Qi, Gaoang Wang, Ying Shan, Xi Li
AAAI 2023 Tagging Before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval Yizhen Chen, Jie Wang, Lijian Lin, Zhongang Qi, Jin Ma, Ying Shan
CVPR 2023 ViLEM: Visual-Language Error Modeling for Image-Text Retrieval Yuxin Chen, Zongyang Ma, Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Ying Shan, Bing Li, Weiming Hu, Xiaohu Qie, Jianping Wu
CVPR 2022 BTS: A Bi-Lingual Benchmark for Text Segmentation in the Wild Xixi Xu, Zhongang Qi, Jianqi Ma, Honglun Zhang, Ying Shan, Xiaohu Qie
NeurIPS 2021 Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution Liangbin Xie, Xintao Wang, Chao Dong, Zhongang Qi, Ying Shan
CVPR 2021 Open-Book Video Captioning with Retrieve-Copy-Generate Network Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Ying Shan, Bing Li, Ying Deng, Weiming Hu
AAAI 2020 ScaleNet - Improve CNNs Through Recursively Rescaling Objects Xingyi Li, Zhongang Qi, Xiaoli Z. Fern, Fuxin Li
AAAI 2020 Visualizing Deep Networks by Optimizing with Integrated Gradients Zhongang Qi, Saeed Khorram, Fuxin Li
CVPRW 2019 Visualizing Deep Networks by Optimizing with Integrated Gradients Zhongang Qi, Saeed Khorram, Fuxin Li
AAAI 2018 Multi-Task Medical Concept Normalization Using Multi-View Convolutional Neural Network Yi Luo, Guojie Song, Pengyu Li, Zhongang Qi