Xie, Weidi

59 publications

ICLR 2025 A Sanity Check for AI-Generated Image Detection Shilin Yan, Ouxiang Li, Jiayin Cai, Yanbin Hao, Xiaolong Jiang, Yao Hu, Weidi Xie
ICLR 2025 EgoExo-Gen: Ego-Centric Video Prediction by Watching Exo-Centric Videos Jilan Xu, Yifei Huang, Baoqi Pei, Junlin Hou, Qingqiu Li, Guo Chen, Yuejie Zhang, Rui Feng, Weidi Xie
CVPR 2025 Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation Yudi Shi, Shangzhe Di, Qirui Chen, Weidi Xie
AAAI 2025 Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos Qirui Chen, Shangzhe Di, Weidi Xie
CVPR 2025 LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant Yikun Liu, Yajie Zhang, Jiayin Cai, Xiaolong Jiang, Yao Hu, Jiangchao Yao, Yanfeng Wang, Weidi Xie
ICCV 2025 Learning Streaming Video Representation via Multitask Training Yibin Yan, Jilan Xu, Shangzhe Di, Yikun Liu, Yudi Shi, Qirui Chen, Zeqian Li, Yifei Huang, Weidi Xie
ICCV 2025 MRGen: Segmentation Data Engine for Underrepresented MRI Modalities Haoning Wu, Ziheng Zhao, Ya Zhang, Yanfeng Wang, Weidi Xie
ICLR 2025 Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning Baoqi Pei, Yifei Huang, Jilan Xu, Guo Chen, Yuping He, Lijin Yang, Yali Wang, Weidi Xie, Yu Qiao, Fei Wu, Limin Wang
ICCV 2025 Object-Centric Video Question Answering with Visual Grounding and Referring Haochen Wang, Qirui Chen, Cilin Yan, Jiayin Cai, Xiaolong Jiang, Yao Hu, Weidi Xie, Stratis Gavves
ICCV 2025 Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation Junyu Xie, Tengda Han, Max Bain, Arsha Nagrani, Eshika Khandelwal, Gül Varol, Weidi Xie, Andrew Zisserman
CVPR 2025 Towards Universal Soccer Video Understanding Jiayuan Rao, Haoning Wu, Hao Jiang, Ya Zhang, Yanfeng Wang, Weidi Xie
ICLR 2025 Track-on: Transformer-Based Online Point Tracking with Memory Görkay Aydemir, Xiongyi Cai, Weidi Xie, Fatma Guney
NeurIPS 2025 Universal Video Temporal Grounding with Generative Multi-Modal Large Language Models Zeqian Li, Shangzhe Di, Zhonghua Zhai, Weilin Huang, Yanfeng Wang, Weidi Xie
NeurIPS 2024 A General Protocol to Probe Large Vision Models for 3D Physical Understanding Guanqi Zhan, Chuanxia Zheng, Weidi Xie, Andrew Zisserman
CVPR 2024 Amodal Ground Truth and Completion in the Wild Guanqi Zhan, Chuanxia Zheng, Weidi Xie, Andrew Zisserman
WACV 2024 Annotation-Free Audio-Visual Segmentation Jinxiang Liu, Yu Wang, Chen Ju, Chaofan Ma, Ya Zhang, Weidi Xie
ECCV 2024 Appearance-Based Refinement for Object-Centric Motion Segmentation Junyu Xie, Weidi Xie, Andrew Zisserman
CVPR 2024 AutoAD III: The Prequel - Back to the Pixels Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
CVPR 2024 Grounded Question-Answering in Long Egocentric Videos Shangzhe Di, Weidi Xie
CVPR 2024 InstaGen: Enhancing Object Detection by Training on Synthetic Dataset Chengjian Feng, Yujie Zhong, Zequn Jie, Weidi Xie, Lin Ma
CVPR 2024 Intelligent Grimm - Open-Ended Visual Storytelling via Latent Diffusion Models Chang Liu, Haoning Wu, Yujie Zhong, Xiaoyun Zhang, Yanfeng Wang, Weidi Xie
ECCV 2024 Knowledge-Enhanced Visual-Language Pretraining for Computational Pathology Xiao Zhou, Xiaoman Zhang, Chaoyi Wu, Ya Zhang, Weidi Xie, Yan-Feng Wang
ECCV 2024 Made to Order: Discovering Monotonic Temporal Changes via Self-Supervised Video Ordering Charig Yang, Weidi Xie, Andrew Zisserman
ECCV 2024 Multi-Sentence Grounding for Long-Term Instructional Video Zeqian Li, Qirui Chen, Tengda Han, Ya Zhang, Yan-Feng Wang, Weidi Xie
CVPR 2024 Retrieval-Augmented Egocentric Video Captioning Jilan Xu, Yifei Huang, Junlin Hou, Guo Chen, Yuejie Zhang, Rui Feng, Weidi Xie
ECCV 2024 VISA: Reasoning Video Object Segmentation via Large Language Model Cilin Yan, Haochen Wang, Shilin Yan, Xiaolong Jiang, Yao Hu, Guoliang Kang, Weidi Xie, Efstratios Gavves
ICCV 2023 AutoAD II: The Sequel - Who, When, and What in Movie Audio Description Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
CVPR 2023 AutoAD: Movie Description in Context Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
CVPRW 2023 Cali-NCE: Boosting Cross-Modal Video Representation Learning with Calibrated Alignment Nanxuan Zhao, Jianbo Jiao, Weidi Xie, Dahua Lin
CVPR 2023 Collaboration Helps Camera Overtake LiDAR in 3D Detection Yue Hu, Yifan Lu, Runsheng Xu, Weidi Xie, Siheng Chen, Yanfeng Wang
ICCV 2023 Joint-Relation Transformer for Multi-Person Motion Prediction Qingyao Xu, Weibo Mao, Jingze Gong, Chenxin Xu, Siheng Chen, Weidi Xie, Ya Zhang, Yanfeng Wang
CVPR 2023 Learning Open-Vocabulary Semantic Segmentation Models from Natural Language Supervision Jilan Xu, Junlin Hou, Yuejie Zhang, Rui Feng, Yi Wang, Yu Qiao, Weidi Xie
ICCV 2023 MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training for X-Ray Diagnosis Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
ICML 2023 Multi-Modal Classifiers for Open-Vocabulary Object Detection Prannay Kaul, Weidi Xie, Andrew Zisserman
CVPRW 2023 NamedMask: Distilling Segmenters from Complementary Foundation Models Gyungin Shin, Weidi Xie, Samuel Albanie
ICCV 2023 Open-Vocabulary Object Segmentation with Diffusion Models Ziyi Li, Qinye Zhou, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
CVPR 2023 OvarNet: Towards Open-Vocabulary Object Attribute Recognition Keyan Chen, Xiaolong Jiang, Yao Hu, Xu Tang, Yan Gao, Jianqi Chen, Weidi Xie
NeurIPS 2023 Self-Supervised Object-Centric Learning for Videos Görkay Aydemir, Weidi Xie, Fatma Guney
ICCV 2023 The Making and Breaking of Camouflage Hala Lamdouar, Weidi Xie, Andrew Zisserman
ICCV 2023 Towards Open-Vocabulary Video Instance Segmentation Haochen Wang, Cilin Yan, Shuai Wang, Xiaolong Jiang, Xu Tang, Yao Hu, Weidi Xie, Efstratios Gavves
CVPRW 2023 Zero-Shot Unsupervised Transfer Instance Segmentation Gyungin Shin, Samuel Albanie, Weidi Xie
NeurIPSW 2023 arXiVeri: Automatic Table Verification with GPT Gyungin Shin, Weidi Xie, Samuel Albanie
NeurIPS 2022 Associating Objects and Their Effects in Video Through Coordination Games Erika Lu, Forrester Cole, Weidi Xie, Tali Dekel, Bill Freeman, Andrew Zisserman, Michael Rubinstein
CVPR 2022 It's About Time: Analog Clock Reading in the Wild Charig Yang, Weidi Xie, Andrew Zisserman
CVPR 2022 Label, Verify, Correct: A Simple Few Shot Object Detection Method Prannay Kaul, Weidi Xie, Andrew Zisserman
ECCV 2022 PromptDet: Towards Open-Vocabulary Detection Using Uncurated Images Chengjian Feng, Yujie Zhong, Zequn Jie, Xiangxiang Chu, Haibing Ren, Xiaolin Wei, Weidi Xie, Lin Ma
ECCV 2022 Prompting Visual-Language Models for Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
NeurIPS 2022 ReCo: Retrieve and Co-Segment for Zero-Shot Transfer Gyungin Shin, Weidi Xie, Samuel Albanie
NeurIPS 2022 Segmenting Moving Objects via an Object-Centric Layered Representation Junyu Xie, Weidi Xie, Andrew Zisserman
CVPR 2022 Temporal Alignment Networks for Long-Term Video Tengda Han, Weidi Xie, Andrew Zisserman
CVPRW 2022 Unsupervised Salient Object Detection with Spectral Cluster Voting Gyungin Shin, Samuel Albanie, Weidi Xie
ICCVW 2021 All You Need Are a Few Pixels: Semantic Segmentation with PixelPick Gyungin Shin, Weidi Xie, Samuel Albanie
CVPR 2021 Localizing Visual Sounds the Hard Way Honglie Chen, Weidi Xie, Triantafyllos Afouras, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman
ICCV 2021 Self-Supervised Video Object Segmentation by Motion Grouping Charig Yang, Hala Lamdouar, Erika Lu, Andrew Zisserman, Weidi Xie
ECCV 2020 Memory-Augmented Dense Predictive Coding for Video Representation Learning Tengda Han, Weidi Xie, Andrew Zisserman
NeurIPS 2020 Self-Supervised Co-Training for Video Representation Learning Tengda Han, Weidi Xie, Andrew Zisserman
ECCV 2020 Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval Andrew Brown, Weidi Xie, Vicky Kalogeiton, Andrew Zisserman
ICCVW 2019 Video Representation Learning by Dense Predictive Coding Tengda Han, Weidi Xie, Andrew Zisserman
ECCV 2018 Comparator Networks Weidi Xie, Li Shen, Andrew Zisserman