ML Anthology
Xie, Weidi
59 publications
ICLR
2025
A Sanity Check for AI-Generated Image Detection
Shilin Yan, Ouxiang Li, Jiayin Cai, Yanbin Hao, Xiaolong Jiang, Yao Hu, Weidi Xie
ICLR
2025
EgoExo-Gen: Ego-Centric Video Prediction by Watching Exo-Centric Videos
Jilan Xu, Yifei Huang, Baoqi Pei, Junlin Hou, Qingqiu Li, Guo Chen, Yuejie Zhang, Rui Feng, Weidi Xie
CVPR
2025
Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation
Yudi Shi, Shangzhe Di, Qirui Chen, Weidi Xie
AAAI
2025
Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
Qirui Chen, Shangzhe Di, Weidi Xie
CVPR
2025
LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant
Yikun Liu, Yajie Zhang, Jiayin Cai, Xiaolong Jiang, Yao Hu, Jiangchao Yao, Yanfeng Wang, Weidi Xie
ICCV
2025
Learning Streaming Video Representation via Multitask Training
Yibin Yan, Jilan Xu, Shangzhe Di, Yikun Liu, Yudi Shi, Qirui Chen, Zeqian Li, Yifei Huang, Weidi Xie
ICCV
2025
MRGen: Segmentation Data Engine for Underrepresented MRI Modalities
Haoning Wu, Ziheng Zhao, Ya Zhang, Yanfeng Wang, Weidi Xie
ICLR
2025
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning
Baoqi Pei, Yifei Huang, Jilan Xu, Guo Chen, Yuping He, Lijin Yang, Yali Wang, Weidi Xie, Yu Qiao, Fei Wu, Limin Wang
ICCV
2025
Object-Centric Video Question Answering with Visual Grounding and Referring
Haochen Wang, Qirui Chen, Cilin Yan, Jiayin Cai, Xiaolong Jiang, Yao Hu, Weidi Xie, Stratis Gavves
ICCV
2025
Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation
Junyu Xie, Tengda Han, Max Bain, Arsha Nagrani, Eshika Khandelwal, Gül Varol, Weidi Xie, Andrew Zisserman
CVPR
2025
Towards Universal Soccer Video Understanding
Jiayuan Rao, Haoning Wu, Hao Jiang, Ya Zhang, Yanfeng Wang, Weidi Xie
ICLR
2025
Track-on: Transformer-Based Online Point Tracking with Memory
Görkay Aydemir, Xiongyi Cai, Weidi Xie, Fatma Guney
NeurIPS
2025
Universal Video Temporal Grounding with Generative Multi-Modal Large Language Models
Zeqian Li, Shangzhe Di, Zhonghua Zhai, Weilin Huang, Yanfeng Wang, Weidi Xie
NeurIPS
2024
A General Protocol to Probe Large Vision Models for 3D Physical Understanding
Guanqi Zhan, Chuanxia Zheng, Weidi Xie, Andrew Zisserman
CVPR
2024
Amodal Ground Truth and Completion in the Wild
Guanqi Zhan, Chuanxia Zheng, Weidi Xie, Andrew Zisserman
WACV
2024
Annotation-Free Audio-Visual Segmentation
Jinxiang Liu, Yu Wang, Chen Ju, Chaofan Ma, Ya Zhang, Weidi Xie
ECCV
2024
Appearance-Based Refinement for Object-Centric Motion Segmentation
Junyu Xie, Weidi Xie, Andrew Zisserman
CVPR
2024
AutoAD III: The Prequel - Back to the Pixels
Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
CVPR
2024
Grounded Question-Answering in Long Egocentric Videos
Shangzhe Di, Weidi Xie
CVPR
2024
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset
Chengjian Feng, Yujie Zhong, Zequn Jie, Weidi Xie, Lin Ma
CVPR
2024
Intelligent Grimm - Open-Ended Visual Storytelling via Latent Diffusion Models
Chang Liu, Haoning Wu, Yujie Zhong, Xiaoyun Zhang, Yanfeng Wang, Weidi Xie
ECCV
2024
Knowledge-Enhanced Visual-Language Pretraining for Computational Pathology
Xiao Zhou, Xiaoman Zhang, Chaoyi Wu, Ya Zhang, Weidi Xie, Yan-Feng Wang
ECCV
2024
Made to Order: Discovering Monotonic Temporal Changes via Self-Supervised Video Ordering
Charig Yang, Weidi Xie, Andrew Zisserman
ECCV
2024
Multi-Sentence Grounding for Long-Term Instructional Video
Zeqian Li, Qirui Chen, Tengda Han, Ya Zhang, Yan-Feng Wang, Weidi Xie
CVPR
2024
Retrieval-Augmented Egocentric Video Captioning
Jilan Xu, Yifei Huang, Junlin Hou, Guo Chen, Yuejie Zhang, Rui Feng, Weidi Xie
ECCV
2024
VISA: Reasoning Video Object Segmentation via Large Language Model
Cilin Yan, Haochen Wang, Shilin Yan, Xiaolong Jiang, Yao Hu, Guoliang Kang, Weidi Xie, Efstratios Gavves
ICCV
2023
AutoAD II: The Sequel - Who, When, and What in Movie Audio Description
Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
CVPR
2023
AutoAD: Movie Description in Context
Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
CVPRW
2023
Cali-NCE: Boosting Cross-Modal Video Representation Learning with Calibrated Alignment
Nanxuan Zhao, Jianbo Jiao, Weidi Xie, Dahua Lin
CVPR
2023
Collaboration Helps Camera Overtake LiDAR in 3D Detection
Yue Hu, Yifan Lu, Runsheng Xu, Weidi Xie, Siheng Chen, Yanfeng Wang
ICCV
2023
Joint-Relation Transformer for Multi-Person Motion Prediction
Qingyao Xu, Weibo Mao, Jingze Gong, Chenxin Xu, Siheng Chen, Weidi Xie, Ya Zhang, Yanfeng Wang
CVPR
2023
Learning Open-Vocabulary Semantic Segmentation Models from Natural Language Supervision
Jilan Xu, Junlin Hou, Yuejie Zhang, Rui Feng, Yi Wang, Yu Qiao, Weidi Xie
ICCV
2023
MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training for X-Ray Diagnosis
Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
ICML
2023
Multi-Modal Classifiers for Open-Vocabulary Object Detection
Prannay Kaul, Weidi Xie, Andrew Zisserman
CVPRW
2023
NamedMask: Distilling Segmenters from Complementary Foundation Models
Gyungin Shin, Weidi Xie, Samuel Albanie
ICCV
2023
Open-Vocabulary Object Segmentation with Diffusion Models
Ziyi Li, Qinye Zhou, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
CVPR
2023
OvarNet: Towards Open-Vocabulary Object Attribute Recognition
Keyan Chen, Xiaolong Jiang, Yao Hu, Xu Tang, Yan Gao, Jianqi Chen, Weidi Xie
NeurIPS
2023
Self-Supervised Object-Centric Learning for Videos
Görkay Aydemir, Weidi Xie, Fatma Guney
ICCV
2023
The Making and Breaking of Camouflage
Hala Lamdouar, Weidi Xie, Andrew Zisserman
ICCV
2023
Towards Open-Vocabulary Video Instance Segmentation
Haochen Wang, Cilin Yan, Shuai Wang, Xiaolong Jiang, Xu Tang, Yao Hu, Weidi Xie, Efstratios Gavves
CVPRW
2023
Zero-Shot Unsupervised Transfer Instance Segmentation
Gyungin Shin, Samuel Albanie, Weidi Xie
NeurIPSW
2023
arXiVeri: Automatic Table Verification with GPT
Gyungin Shin, Weidi Xie, Samuel Albanie
NeurIPS
2022
Associating Objects and Their Effects in Video Through Coordination Games
Erika Lu, Forrester Cole, Weidi Xie, Tali Dekel, Bill Freeman, Andrew Zisserman, Michael Rubinstein
CVPR
2022
It's About Time: Analog Clock Reading in the Wild
Charig Yang, Weidi Xie, Andrew Zisserman
CVPR
2022
Label, Verify, Correct: A Simple Few Shot Object Detection Method
Prannay Kaul, Weidi Xie, Andrew Zisserman
ECCV
2022
PromptDet: Towards Open-Vocabulary Detection Using Uncurated Images
Chengjian Feng, Yujie Zhong, Zequn Jie, Xiangxiang Chu, Haibing Ren, Xiaolin Wei, Weidi Xie, Lin Ma
ECCV
2022
Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
NeurIPS
2022
ReCo: Retrieve and Co-Segment for Zero-Shot Transfer
Gyungin Shin, Weidi Xie, Samuel Albanie
NeurIPS
2022
Segmenting Moving Objects via an Object-Centric Layered Representation
Junyu Xie, Weidi Xie, Andrew Zisserman
CVPR
2022
Temporal Alignment Networks for Long-Term Video
Tengda Han, Weidi Xie, Andrew Zisserman
CVPRW
2022
Unsupervised Salient Object Detection with Spectral Cluster Voting
Gyungin Shin, Samuel Albanie, Weidi Xie
ICCVW
2021
All You Need Are a Few Pixels: Semantic Segmentation with PixelPick
Gyungin Shin, Weidi Xie, Samuel Albanie
CVPR
2021
Localizing Visual Sounds the Hard Way
Honglie Chen, Weidi Xie, Triantafyllos Afouras, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman
ICCV
2021
Self-Supervised Video Object Segmentation by Motion Grouping
Charig Yang, Hala Lamdouar, Erika Lu, Andrew Zisserman, Weidi Xie
ECCV
2020
Memory-Augmented Dense Predictive Coding for Video Representation Learning
Tengda Han, Weidi Xie, Andrew Zisserman
NeurIPS
2020
Self-Supervised Co-Training for Video Representation Learning
Tengda Han, Weidi Xie, Andrew Zisserman
ECCV
2020
Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval
Andrew Brown, Weidi Xie, Vicky Kalogeiton, Andrew Zisserman
ICCVW
2019
Video Representation Learning by Dense Predictive Coding
Tengda Han, Weidi Xie, Andrew Zisserman
ECCV
2018
Comparator Networks
Weidi Xie, Li Shen, Andrew Zisserman