ECCV 2024

2387 papers

∞-Brush: Controllable Large Image Synthesis with Diffusion Models in Infinite Dimensions Minh-Quan Le, Alexandros Graikos, Srikar Yellapragada, Rajarsi Gupta, Joel Saltz, Dimitris Samaras
PDF
2S-ODIS: Two-Stage Omni-Directional Image Synthesis by Geometric Distortion Correction Atsuya Nakata, Takao Yamanaka
PDF
3D Congealing: 3D-Aware Image Alignment in the Wild Yunzhi Zhang, Zizhang Li, Amit Raj, Andreas Engelhardt, Yuanzhen Li, Tingbo Hou, Jiajun Wu, Varun Jampani
PDF
3D Gaussian Parametric Head Model Yuelang Xu, Lizhen Wang, Zerong Zheng, Zhaoqi Su, Yebin Liu
PDF
3D Hand Pose Estimation in Everyday Egocentric Images Aditya Prakash, Ruisen Tu, Matthew Chang, Saurabh Gupta
PDF
3D Hand Sequence Recovery from Real Blurry Images and Event Stream JoonKyu Park, Gyeongsik Moon, Weipeng Xu, Evan Kaseman, Takaaki Shiratori, Kyoung Mu Lee
PDF
3D Human Pose Estimation via Non-Causal Retentive Networks Kaili Zheng, Feixiang Lu, Yihao Lv, Liangjun Zhang, Chenyi Guo, Ji Wu
PDF
3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation Zihao Xiao, Longlong Jing, Shangxuan Wu, Alex Zihao Zhu, Jingwei Ji, Chiyu Max Jiang, Wei-Chih Hung, Thomas Funkhouser, Weicheng Kuo, Anelia Angelova, Yin Zhou, Shiwei Sheng
PDF
3D Reconstruction of Objects in Hands Without Real World 3D Supervision Aditya Prakash, Matthew Chang, Matthew Jin, Ruisen Tu, Saurabh Gupta
PDF
3D Single-Object Tracking in Point Clouds with High Temporal Variation Qiao Wu, Kun Sun, Pei An, Mathieu Salzmann, Yanning Zhang, Jiaqi Yang
PDF
3D Small Object Detection with Dynamic Spatial Pruning Zhihao Sun, Ziwei Wang, Hongmin Liu, Jie Zhou, Jiwen Lu, Xiuwei Xu
PDF
3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance Xiaoxu Xu, Yitian Yuan, Jinlong Li, Qiudan Zhang, Zequn Jie, Lin Ma, Hao Tang, Nicu Sebe, Xu Wang
PDF
3D-GOI: 3D GAN Omni-Inversion for Multifaceted and Multi-Object Editing Haoran Li, Long Ma, Haolin Shi, Yanbin Hao, Yong Liao, Lechao Cheng, Peng Yuan Zhou
PDF
3DEgo: 3D Editing on the Go! Umar Khalid, Hasan Iqbal, Azib Farooq, Jing Hua, Chen Chen
PDF
3DFG-PIFu: 3D Feature Grids for Human Digitization from Sparse Views Kennard Yanting Chan, Fayao Liu, Guosheng Lin, Chuan Sheng Foo, Weisi Lin
PDF
3DGazeNet: Generalizing Gaze Estimation with Weak Supervision from Synthetic Views Evangelos Ververas, Polydefkis Gkagkos, Jiankang Deng, Michail C Doukas, Jia Guo, Stefanos Zafeiriou
PDF
3DSA: Multi-View 3D Human Pose Estimation with 3D Space Attention Mechanisms Po Han Chen, Chia-Chi Tsai
PDF
3iGS: Factorised Tensorial Illumination for 3D Gaussian Splatting Zhe Jun Tang, Tat-Jen Cham
PDF
3R-INN: How to Be Climate Friendly While Consuming/Delivering Videos? Zoubida Ameur, Claire-Helene Demarty, Olivier Le Meur, Daniel Menard
PDF
3x2: 3D Object Part Segmentation by 2D Semantic Correspondences Anh Thai, Weiyao Wang, Hao Tang, Stefan Stojanov, James M Rehg, Matt Feiszli
PDF
4D Contrastive Superflows Are Dense 3D Representation Learners Xiang Xu, Lingdong Kong, Hui Shuai, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu, Qingshan Liu
PDF
4Diff: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation Feng Cheng, Mi Luo, Huiyu Wang, Alex Dimakis, Lorenzo Torresani, Gedas Bertasius, Kristen Grauman
PDF
6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model Matteo Bortolon, Theodore Tsesmelis, Stuart James, Fabio Poiesi, Alessio Del Bue
PDF
6DoF Head Pose Estimation Through Explicit Bidirectional Interaction with Face Geometry Sungho Chun, Ju Yong Chang
PDF
A Cephalometric Landmark Regression Method Based on Dual-Encoder for High-Resolution X-Ray Image Chao Dai, Yang Wang, Chaolin Huang, Zhou Jiakai, Qilin Xu, Minpeng Xu
PDF
A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion Attacks Yixiang Qiu, Hao Fang, Hongyao Yu, Bin Chen, Meikang Qiu, Shu-Tao Xia
PDF
A Compact Dynamic 3D Gaussian Representation for Real-Time Dynamic View Synthesis Kai Katsumata, Duc Minh Vo, Hideki Nakayama
PDF
A Comparative Study of Image Restoration Networks for General Backbone Network Design Xiangyu Chen, Zheyuan Li, Yuandong Pu, Yihao Liu, Jiantao Zhou, Yu Qiao, Chao Dong
PDF
A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment Tianhe Wu, Kede Ma, Jie Liang, Yujiu Yang, Lei Zhang
PDF
A Diffusion Model for Simulation Ready Coronary Anatomy with Morpho-Skeletal Control Karim Kadry, Shreya Gupta, Jonas Sogbadji, Michiel Schaap, Kersten Petersen, Takuya Mizukami, Carlos Collet, Farhad R. Nezami, Elazer R Edelman
PDF
A Direct Approach to Viewing Graph Solvability Federica Arrigoni, Andrea Fusiello, Tomas Pajdla
PDF
A Fair Ranking and New Model for Panoptic Scene Graph Generation Julian Lorenz, Alexander Pest, Daniel Kienzle, Katja Ludwig, Rainer Lienhart
PDF
A Framework for Efficient Model Evaluation Through Stratification, Sampling, and Estimation Riccardo Fogliato, Pratik Patil, Mathew Monfort, Pietro Perona
PDF
A Geometric Distortion Immunized Deep Watermarking Framework with Robustness Generalizability Linfeng Ma, Han Fang, Tianyi Wei, Zijin Yang, Zehua Ma, Weiming Zhang, Nenghai Yu
PDF
A Graph-Based Approach for Category-Agnostic Pose Estimation Or Hirschorn, Shai Avidan
PDF
A High-Quality Robust Diffusion Framework for Corrupted Dataset Quan Dao, Binh Ta, Tung Pham, Anh Tran
PDF
A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis Xiang Liu, Zhaoxiang Liu, Huan Hu, Zezhou Chen, Kohou Wang, Kai Wang, Shiguo Lian
PDF
A New Dataset and Framework for Real-World Blurred Images Super-Resolution Rui Qin, Ming Sun, Chao Zhou, Bin Wang
PDF
A Probability-Guided Sampler for Neural Implicit Surface Rendering Gonçalo José Dias Pais, Valter André Piedade, Moitreya Chatterjee, Marcus Greiff, Pedro Miraldo
PDF
A Riemannian Approach for Spatiotemporal Analysis and Generation of 4D Tree-Shaped Structures Tahmina Khanam, Mohammed Bennamoun, Guan Wang, Guanjin Wang, Ferdous Sohel, Farid Boussaid, Anuj Srivastava, Hamid Laga
PDF
A Rotation-Invariant Texture ViT for Fine-Grained Recognition of Esophageal Cancer Endoscopic Ultrasound Images Tianyi Liu, Shuaishuai S Zhuang, Jiacheng Nie, Geng Chen, Yusheng Guo, Guangquan Zhou, Jean-Louis Coatrieux, Yang Chen
PDF
A Secure Image Watermarking Framework with Statistical Guarantees via Adversarial Attacks on Secret Key Networks Feiyu Chen, Wei Lin, Ziquan Liu, Antoni Chan
PDF
A Semantic Space Is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties Junfei Xiao, Ziqi Zhou, Wenxuan Li, Shiyi Lan, Jieru Mei, Zhiding Yu, Bingchen Zhao, Alan Yuille, Yuyin Zhou, Cihang Xie
PDF
A Simple Background Augmentation Method for Object Detection with Diffusion Model Yuhang Li, Xin Dong, Chen Chen, Weiming Zhuang, Lingjuan Lyu
PDF
A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars Ronglai Zuo, Fangyun Wei, Zenggui Chen, Brian Mak, Jiaolong Yang, Xin Tong
PDF
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting Wouter Van Gansbeke, Bert De Brabandere
PDF
A Simple Low-Bit Quantization Framework for Video Snapshot Compressive Imaging Miao Cao, Lishun Wang, Huan Wang, Xin Yuan
PDF
A Task Is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting Junhao Zhuang, Yanhong Zeng, Wenran Liu, Chun Yuan, Kai Chen
PDF
A Unified Anomaly Synthesis Strategy with Gradient Ascent for Industrial Anomaly Detection and Localization Qiyu Chen, Huiyuan Luo, Chengkan Lv, Zhengtao Zhang
PDF
A Unified Image Compression Method for Human Perception and Multiple Vision Tasks Sha Guo, Lin Sui, Chen-Lin Zhang, Zhuo Chen, Wenhan Yang, Lingyu Duan
PDF
A Watermark-Conditioned Diffusion Model for IP Protection Rui Min, Sen Li, Hongyang Chen, Minhao Cheng
PDF
ABC Easy as 123: A Blind Counter for Exemplar-Free Multi-Class Class-Agnostic Counting Michael A Hobley, Victor Adrian Prisacariu
PDF
AccDiffusion: An Accurate Method for Higher-Resolution Image Generation Zhihang Lin, Mingbao Lin, Meng Zhao, Rongrong Ji
PDF
Accelerating Image Generation with Sub-Path Linear Approximation Model Chen Xu, Tianhui Song, Weixin Feng, Xubin Li, Tiezheng Ge, Bo Zheng, Limin Wang
PDF
Accelerating Image Super-Resolution Networks with Pixel-Level Classification Jinho Jeong, Jinwoo Kim, Younghyun Jo, Seon Joo Kim
PDF
Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention Xunjiang Gu, Guanyu Song, Igor Gilitschenski, Marco Pavone, Boris Ivanovic
PDF
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos Changan Chen, Puyuan Peng, Ami Baid, Zihui Xue, Wei-Ning Hsu, David Harwath, Kristen Grauman
PDF
ActionSwitch: Class-Agnostic Detection of Simultaneous Actions in Streaming Videos Hyolim Kang, Jeongseok Hyun, Joungbin An, Youngjae Yu, Seon Joo Kim
PDF
ActionVOS: Actions as Prompts for Video Object Segmentation Liangyang Ouyang, Ruicong Liu, Yifei Huang, Ryosuke Furuta, Yoichi Sato
PDF
Active Coarse-to-Fine Segmentation of Moveable Parts from Real Images Ruiqi Wang, Akshay Gadi Patil, Fenggen Yu, Hao Zhang
PDF
Active Generation for Image Classification Tao Huang, Jiaqi Liu, Shan You, Chang Xu
PDF
AD3: Introducing a Score for Anomaly Detection Dataset Difficulty Assessment Using VIADUCT Dataset Jan D Lehr, Jan H Philipps, Alik Sargsyan, Martin Pape, Jörg Krüger
PDF
AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection Yunkang Cao, Jiangning Zhang, Luca Frittoli, Yuqi Cheng, Weiming Shen, Giacomo Boracchi
PDF
AdaDiff: Accelerating Diffusion Models Through Step-Wise Adaptive Computation Shengkun Tang, Yaqing Wang, Caiwen Ding, Yi Liang, Yao Li, Dongkuan Xu
PDF
AdaDiffSR: Adaptive Region-Aware Dynamic Acceleration Diffusion Model for Real-World Image Super-Resolution Yuanting Fan, Chengxu Liu, Nengzhong Yin, Changlong Gao, Xueming Qian
PDF
AdaDistill: Adaptive Knowledge Distillation for Deep Face Recognition Fadi Boutros, Vitomir Struc, Naser Damer
PDF
AdaGlimpse: Active Visual Exploration with Arbitrary Glimpse Position and Scale Adam Pardyl, Michał Wronka, Maciej Wołczyk, Kamil Adamczewski, Tomasz Trzcinski, Bartosz Zieliński
PDF
AdaIFL: Adaptive Image Forgery Localization via a Dynamic and Importance-Aware Transformer Network Yuxi Li, Fuyuan Cheng, Wangbo Yu, Guangshuo Wang, Guibo Luo, Yuesheng Zhu
PDF
AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer Zhuguanyu Wu, Jiaxin Chen, Hanwen Zhong, Di Huang, Yunhong Wang
PDF
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation Zanlin Ni, Yulin Wang, Renping Zhou, Rui Lu, Jiayi Guo, Jinyi Hu, Zhiyuan Liu, Yuan Yao, Gao Huang
PDF
Adapt Without Forgetting: Distill Proximity from Dual Teachers in Vision-Language Models Mengyu Zheng, Yehui Tang, Zhiwei Hao, Kai Han, Yunhe Wang, Chang Xu
PDF
Adapt2Reward: Adapting Video-Language Models to Generalizable Robotic Rewards via Failure Prompts Yanting Yang, Minghao Chen, Qibo Qiu, Jiahao Wu, Wenxiao Wang, Binbin Lin, Ziyu Guan, Xiaofei He
PDF
Adapting Fine-Grained Cross-View Localization to Areas Without Fine Ground Truth Zimin Xia, Yujiao Shi, Hongdong Li, Julian F. P. Kooij
PDF
Adapting to Shifting Correlations with Unlabeled Data Calibration Minh Nguyen, Alan Q Wang, Heejong Kim, Mert Sabuncu
PDF
Adaptive Annealing for Robust Averaging Sidhartha Chitturi, Venu Madhav Govindu
PDF
Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction Alexander Timans, Christoph-Nikolas Straehle, Kaspar Sakmann, Eric Nalisnick
PDF
Adaptive Compressed Sensing with Diffusion-Based Posterior Sampling Noam Elata, Tomer Michaeli, Michael Elad
PDF
Adaptive Correspondence Scoring for Unsupervised Medical Image Registration Xiaoran Zhang, John C. Stendahl, Lawrence H. Staib, Albert J. Sinusas, Alex Wong, James S. Duncan
PDF
Adaptive High-Frequency Transformer for Diverse Wildlife Re-Identification Chenyue Li, Shuoyi Chen, Mang Ye
PDF
Adaptive Human Trajectory Prediction via Latent Corridors Neerja Thakkar, Karttikeya Mangalam, Andrea Bajcsy, Jitendra Malik
PDF
Adaptive Multi-Head Contrastive Learning Lei Wang, Piotr Koniusz, Tom Gedeon, Liang Zheng
PDF
Adaptive Multi-Modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution Junxiong Lin, Yan Wang, Zeng Tao, Boyang Wang, Qing Zhao, Haoran Wang, Xuan Tong, Xinji Mai, Yuxuan Lin, Wei Song, Jiawen Yu, Shaoqi Yan, Wenqiang Zhang
PDF
Adaptive Multi-Task Learning for Few-Shot Object Detection Yan Ren, Yanling Li, Adams Wai-Kin Kong
PDF
Adaptive Parametric Activation Konstantinos P Alexandridis, Jiankang Deng, Anh Nguyen, Shan Luo
PDF
Adaptive Selection of Sampling-Reconstruction in Fourier Compressed Sensing Seongmin Hong, Jaehyeok Bae, Jongho Lee, Se Young Chun
PDF
AdaShield: Safeguarding Multimodal Large Language Models from Structure-Based Attack via Adaptive Shield Prompting Yu Wang, Xiaogeng Liu, Yu Li, Muhao Chen, Chaowei Xiao
PDF
AddBiomechanics Dataset: Capturing the Physics of Human Motion at Scale Keenon Werling, Janelle M Kaneda, Tian Tan, Rishi Agarwal, Six Skov, Tom Van Wouwe, Scott Uhlrich, Scott Delp, Karen Liu, Nicholas A Bianco, Carmichael Ong, Antoine Falisse, Shardul Sapkota, Aidan Jai Chandra, Joshua A Carter, Ezio Preatoni, Benjamin J Fregly, Jennifer Hicks
PDF
AddMe: Zero-Shot Group-Photo Synthesis by Inserting People into Scenes Dongxu Yue, Maomao Li, Yunfei Liu, Ailing Zeng, Tianyu Yang, Qin Guo, Yu Li
PDF
AddressCLIP: Empowering Vision-Language Models for City-Wide Image Address Localization Shixiong Xu, Chenghao Zhang, Lubin Fan, Gaofeng Meng, Shiming Xiang, Jieping Ye
PDF
ADen: Adaptive Density Representations for Sparse-View Camera Pose Estimation Hao Tang, Weiyao Wang, Pierre Gleize, Matt Feiszli
PDF
ADMap: Anti-Disturbance Framework for Vectorized HD Map Construction Haotian Hu, Fanyi Wang, Yaonong Wang, Laifeng Hu, Jingwei Xu, Zhiwang Zhang
PDF
AdvDiff: Generating Unrestricted Adversarial Examples Using Diffusion Models Xuelong Dai, Kaisheng Liang, Bin Xiao
PDF
Adversarial Diffusion Distillation Axel Sauer, Dominik Lorenz, Andreas Blattmann, Robin Rombach
PDF
Adversarial Prompt Tuning for Vision-Language Models Jiaming Zhang, Xingjun Ma, Xin Wang, Lingyu Qiu, Jiaqi Wang, Yu-Gang Jiang, Jitao Sang
PDF
Adversarial Robustification via Text-to-Image Diffusion Models Daewon Choi, Jongheon Jeong, Huiwon Jang, Jinwoo Shin
PDF
AdversariaLeak: External Information Leakage Attack Using Adversarial Samples on Face Recognition Systems Roye Katzav, Amit Giloni, Edita Grolman, Hiroo Saito, Tomoyuki Shibata, Tsukasa Omino, Misaki Komatsu, Yoshikazu Hanatani, Yuval Elovici, Asaf Shabtai
PDF
Adversarially Robust Distillation by Reducing the Student-Teacher Variance Gap Junhao Dong, Piotr Koniusz, Junxi Chen, Yew-Soon Ong
PDF
AEDNet: Adaptive Embedding and Multiview-Aware Disentanglement for Point Cloud Completion Zhiheng Fu, Longguang Wang, Lian Xu, Zhiyong Wang, Hamid Laga, Yulan Guo, Farid Boussaid, Mohammed Bennamoun
PDF
AFF-Ttention! Affordances and Attention Models for Short-Term Object Interaction Anticipation Lorenzo Mur-Labadia, Ruben Martinez-Cantin, Jose J Guerrero, Giovanni Maria Farinella, Antonino Furnari
PDF
Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations Kilichbek Haydarov, Xiaoqian Shen, Avinash Madasu, Mahmoud Salem, Li-Jia Li, Gamaleldin F Elsayed, Mohamed Elhoseiny
PDF
Affine Steerers for Structured Keypoint Description Georg Bökman, Johan Edstedt, Michael Felsberg, Fredrik Kahl
PDF
AFreeCA: Annotation-Free Counting for All Adriano D'Alessandro, Ali Mahdavi-Amiri, Ghassan Hamarneh
PDF
Agent Attention: On the Integration of SoftMax and Linear Attention Dongchen Han, Tianzhu Ye, Yizeng Han, Zhuofan Xia, Siyuan Pan, Pengfei Wan, Shiji Song, Gao Huang
PDF
Agent3D-Zero: An Agent for Zero-Shot 3D Understanding Sha Zhang, Di Huang, Jiajun Deng, Shixiang Tang, Wanli Ouyang, Tong He, Yanyong Zhang
PDF
Agglomerative Token Clustering Joakim Bruslund Haurum, Sergio Escalera, Graham W. Taylor, Thomas B. Moeslund
PDF
AID-AppEAL: Automatic Image Dataset and Algorithm for Content Appeal Enhancement and Assessment Labeling Sherry X. Chen, Yaron Vaxman, Elad Ben Baruch, David Asulin, Aviad Moreshet, Misha Sra, Pradeep Sen
PDF
Align Before Collaborate: Mitigating Feature Misalignment for Robust Multi-Agent Perception Dingkang Yang, Dingkang Yang, Ke Li, Dongling Xiao, Zedian Shao, Peng Sun, Liang Song
PDF
AlignDiff: Aligning Diffusion Models for General Few-Shot Segmentation Ri-Zhao Qiu, Yu-Xiong Wang, Kris Hauser
PDF
Aligning Neuronal Coding of Dynamic Visual Scenes with Foundation Vision Models Rining Wu, Feixiang Zhou, Ziwei Yin, Jian Liu
PDF
Alignist: CAD-Informed Orientation Distribution Estimation by Fusing Shape and Correspondences Shishir Reddy Vutukur, Junwen Huang, Rasmus Laurvig Haugaard, Benjamin Busam, Tolga Birdal
PDF
AlignZeg: Mitigating Objective Misalignment for Zero-Shot Semantic Segmentation Jiannan Ge, Lingxi Xie, Hongtao Xie, Pandeng Li, Xiaopeng Zhang, Yongdong Zhang, Qi Tian
PDF
All You Need Is Your Voice: Emotional Face Representation with Audio Perspective for Emotional Talking Face Generation Seongho Kim, Byung Cheol Song
PDF
Alternate Diverse Teaching for Semi-Supervised Medical Image Segmentation Zhen Zhao, Zicheng Wang, Dian Yu, Longyue Wang, Yixuan Yuan, Luping Zhou
PDF
AMD: Automatic Multi-Step Distillation of Large-Scale Vision Models Cheng Han, Qifan Wang, Sohail A Dianat, Majid Rabbani, Raghuveer Rao, Yi Fang, Qiang Guan, Lifu Huang, Dongfang Liu
PDF
AMEGO: Active Memory from Long EGOcentric Videos Gabriele Goletto, Tushar Nagarajan, Giuseppe Averta, Dima Damen
PDF
AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-Level Retrieval Pavel Suma, Giorgos Kordopatis-Zilos, Ahmet Iscen, Giorgos Tolias
PDF
An Accurate Detection Is Not All You Need to Combat Label Noise in Web-Noisy Datasets Paul Albert, Kevin McGuinness, Eric Arazo, Tarun Krishna, Noel O'Connor, Jack Valmadre
PDF
An Adaptive Screen-Space Meshing Approach for Normal Integration Moritz Heep, Eduard Zell
PDF
An Economic Framework for 6-DoF Grasp Detection Xiao-Ming Wu, Jia-Feng Cai, Jian-Jian Jiang, Dian Zheng, Yi-Lin Wei, Wei-Shi Zheng
PDF
An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding Wei Chen, Long Chen, Yu Wu
PDF
An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation Zhiyu Tan, Mengping Yang, Luozheng Qin, Hao Yang, Ye Qian, Qiang Zhou, Cheng Zhang, Hao Li
PDF
An Explainable Vision Question Answer Model via Diffusion Chain-of-Thought Chunhao Lu, Qiang Lu, Jake Luo
PDF
An Image Is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models Liang Chen, Haozhe Zhao, Tianyu Liu, Shuai Bai, Junyang Lin, Chang Zhou, Baobao Chang
PDF
An Incremental Unified Framework for Small Defect Inspection Jiaqi Tang, Hao Lu, Xiaogang Xu, Ruizheng Wu, Sixing Hu, Tong Zhang, Tsz Wa Cheng, Ming Ge, Ying-Cong Chen, Fugee Tsung
PDF
An Information Theoretical View for Out-of-Distribution Detection Hu Jinjing, Wenrui Liu, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen
PDF
An Optimal Control View of LoRA and Binary Controller Design for Vision Transformers Chi Zhang, Jingpu Cheng, Qianxiao Li
PDF
An Optimization Framework to Enforce Multi-View Consistency for Texturing 3D Meshes Zhengyi Zhao, Chen Song, Xiaodong Gu, Yuan Dong, Qi Zuo, Weihao Yuan, Zilong Dong, Liefeng Bo, Qixing Huang
PDF
Analysis-by-Synthesis Transformer for Single-View 3D Reconstruction Dian Jia, Xiaoqian Ruan, Kun Xia, Zhiming Zou, Le Wang, Wei Tang
PDF
Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration Zhihao Liang, Qi Zhang, Wenbo Hu, Ying Feng, Lei Zhu, Kui Jia
PDF
AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-Guided Self-Masking Yuheng Li, Tianyu Luan, Yizhou Wu, Shaoyan Pan, Yenho Chen, Xiaofeng Yang
PDF
Animal Avatars: Reconstructing Animatable 3D Animals from Casual Videos Remy Sabathier, David Novotny, Niloy Mitra
PDF
AnimatableDreamer: Text-Guided Non-Rigid 3D Model Generation and Reconstruction with Canonical Score Distillation Xinzhou Wang, Yikai Wang, Junliang Ye, Fuchun Sun, Zhengyi Wang, Ling Wang, Pengkun Liu, Kai Sun, Xintong Wang, Xie Wende, Fangfu Liu, Bin He
PDF
Animate Your Motion: Turning Still Images into Dynamic Videos Mingxiao Li, Bo Wan, Sien Moens, Tinne Tuytelaars
PDF
AnimateMe: 4D Facial Expressions via Diffusion Models Dimitrios Gerogiannis, Foivos Paraperas Papantoniou, Rolandos Alexandros Potamias, Alexandros Lattas, Stylianos Moschoglou, Stylianos Ploumpis, Stefanos Zafeiriou
PDF
Any Target Can Be Offense: Adversarial Example Generation via Generalized Latent Infection Youheng Sun, Shengming Yuan, Xuanhan Wang, Lianli Gao, Jingkuan Song
PDF
Any2Point: Empowering Any-Modality Transformers for Efficient 3D Understanding Yiwen Tang, Ray Zhang, Jiaming Liu, Zoey Guo, Bin Zhao, Zhigang Wang, Dong Wang, Peng Gao, Hongsheng Li, Xuelong Li
PDF
AnyControl: Create Your Artwork with Versatile Control on Text-to-Image Generation Yanan Sun, Yanchen Liu, Yinhao Tang, Wenjie Pei, Kai Chen
PDF
AnyHome: Open-Vocabulary Large-Scale Indoor Scene Generation with First-Person View Exploration Rao Fu, Zehao Wen, Zichen Liu, Srinath Sridhar
PDF
Anytime Continual Learning for Open Vocabulary Classification Zhen Zhu, Yiming Gong, Derek Hoiem
PDF
APL: Anchor-Based Prompt Learning for One-Stage Weakly Supervised Referring Expression Comprehension Yaxin Luo, Jiayi Ji, Xiaofu Chen, Yuxin Zhang, Tianhe Ren, Gen Luo
PDF
Appearance-Based Refinement for Object-Centric Motion Segmentation Junyu Xie, Weidi Xie, Andrew Zisserman
PDF
Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene Ruiyang Zhang, Hu Zhang, Hang Yu, Zhedong Zheng
PDF
Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors Wei Shang, Dongwei Ren, Wanying Zhang, Yuming Fang, Wangmeng Zuo, Kede Ma
PDF
Arc2Face: A Foundation Model for ID-Consistent Human Faces Foivos Paraperas Papantoniou, Alexandros Lattas, Stylianos Moschoglou, Jiankang Deng, Bernhard Kainz, Stefanos Zafeiriou
PDF
Are Synthetic Data Useful for Egocentric Hand-Object Interaction Detection? Rosario Leonardi, Antonino Furnari, Francesco Ragusa, Giovanni Maria Farinella
PDF
ARoFace: Alignment Robustness to Improve Low-Quality Face Recognition Mohammad Saeed Ebrahimi Saadabadi, Sahar Rahimi Malakshan, Ali Dabouei, Nasser Nasrabadi
PDF
ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling William Yicheng Zhu, Keren Ye, Junjie Ke, Jiahui Yu, Leonidas Guibas, Peyman Milanfar, Feng Yang
PDF
Assessing Sample Quality via the Latent Space of Generative Models Jingyi Xu, Hieu Le, Dimitris Samaras
PDF
Asymmetric Mask Scheme for Self-Supervised Real Image Denoising Xiangyu Liao, Tianheng Zheng, Jiayu Zhong, Pingping Zhang, Chao Ren
PDF
Asynchronous Bioplausible Neuron for Spiking Neural Networks for Event-Based Vision Hussain Sajwani, Dimitrios Makris, Yahya Zweiri, Fariborz Baghaei Naeini, Sanket Kachole
PDF
Asynchronous Large Language Model Enhanced Planner for Autonomous Driving Yuan Chen, Zi-han Ding, Ziqin Wang, Yan Wang, Lijun Zhang, Si Liu
PDF
Attention Beats Linear for Fast Implicit Neural Representation Generation Shuyi Zhang, Ke Liu, Jingjun Gu, Xiaoxu Cai, Zhihua Wang, Jiajun Bu, Haishuai Wang
PDF
Attention Decomposition for Cross-Domain Semantic Segmentation Liqiang He, Sinisa Todorovic
PDF
Attention Prompting on Image for Large Vision-Language Models Runpeng Yu, Weihao Yu, Xinchao Wang
PDF
Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification Yunlong Zhang, Honglin Li, Yuxuan Sun, Chenglu Zhu, Sunyi Zheng, Lin Yang
PDF
AttentionHand: Text-Driven Controllable Hand Image Generation for 3D Hand Reconstruction in the Wild Junho Park, Kyeongbo Kong, Suk-Ju Kang
PDF
AttnZero: Efficient Attention Discovery for Vision Transformers Lujun Li, Zimian Wei, Peijie Dong, Wenhan Luo, Wei Xue, Qifeng Liu, Yike Guo
PDF
Audio-Driven Talking Face Generation with Stabilized Synchronization Loss Dogucan Yaman, Fevziye Irem Eyiokur, Leonard Bärmann, Hazim Kemal Ekenel, Alexander Waibel
PDF
Audio-Synchronized Visual Animation Lin Zhang, Shentong Mo, Yijing Zhang, Pedro Morgado
PDF
Audio-Visual Generalized Zero-Shot Learning the Easy Way Shentong Mo, Pedro Morgado
PDF
AUFormer: Vision Transformers Are Parameter-Efficient Facial Action Unit Detectors Kaishen Yuan, Zitong Yu, Xin Liu, Weicheng Xie, Huanjing Yue, Jingyu Yang
PDF
AugDETR: Improving Multi-Scale Learning for Detection Transformer Jinpeng Dong, Yutong Lin, Chen Li, Sanping Zhou, Nanning Zheng
PDF
Augmented Neural Fine-Tuning for Efficient Backdoor Purification Nazmul Karim, Abdullah Al Arafat, Umar Khalid, Zhishan Guo, Nazanin Rahnavard
PDF
AugUndo: Scaling up Augmentations for Monocular Depth Completion and Estimation Yangchao Wu, Tian Yu Liu, Hyoungseob Park, Stefano Soatto, Dong Lao, Alex Wong
PDF
Auto-DAS: Automated Proxy Discovery for Training-Free Distillation-Aware Architecture Search Haosen Sun, Lujun Li, Peijie Dong, Zimian Wei, Shitong Shao
PDF
Auto-GAS: Automated Proxy Discovery for Training-Free Generative Architecture Search Lujun Li, Haosen Sun, Shiwen Li, Peijie Dong, Wenhan Luo, Wei Xue, Qifeng Liu, Yike Guo
PDF
AutoDIR: Automatic All-in-One Image Restoration with Latent Diffusion Yitong Jiang, Zhaoyang Zhang, Tianfan Xue, Jinwei Gu
PDF
AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering Xiuyuan Chen, Yuan Lin, Yuchen Zhang, Weiran Huang
PDF
Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos Ekta Prashnani, Koki Nagano, Shalini De Mello, David P Luebke, Orazio Gallo
PDF
AvatarPose: Avatar-Guided 3D Pose Estimation of Close Human Interaction from Sparse Multi-View Videos Feichi Lu, Zijian Dong, Jie Song, Otmar Hilliges
PDF
AWOL: Analysis WithOut Synthesis Using Language Silvia Zuffi, Michael J. Black
PDF
Background Adaptation with Residual Modeling for Exemplar-Free Class-Incremental Semantic Segmentation Anqi Zhang, Guangyu Gao
PDF
Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding Talfan Evans, Shreya Pathak, Hamza Merzic, Jonathan Richard Schwarz, Ryutaro Tanno, Olivier Henaff
PDF
BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting Lingzhe Zhao, Peng Wang, Peidong Liu
PDF
BAFFLE: A Baseline of Backpropagation-Free Federated Learning Haozhe Feng, Tianyu Pang, Chao Du, Wei Chen, Shuicheng Yan, Min Lin
PDF
BAGS: Blur Agnostic Gaussian Splatting Through Multi-Scale Kernel Modeling Cheng Peng, Yutao Tang, Yifan Zhou, Nengyu Wang, Xijun Liu, Deming Li, Rama Chellappa
PDF
BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos Pilhyeon Lee, Hyeran Byun
PDF
BAMM: Bidirectional Autoregressive Motion Model Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang, Minwoo Lee, Srijan Das, Chen Chen
PDF
BaSIC: BayesNet Structure Learning for Computational Scalable Neural Image Compression Yufeng Zhang, Hang Yu, Shizhan Liu, Wenrui Dai, Weiyao Lin
PDF
Bayesian Detector Combination for Object Detection with Crowdsourced Annotations Zhi Qin Tan, Olga Isupova, Gustavo Carneiro, Xiatian Zhu, Yunpeng Li
PDF
Bayesian Evidential Deep Learning for Online Action Detection Hongji Guo, Hanjing Wang, Qiang Ji
PDF
Bayesian Self-Training for Semi-Supervised 3D Segmentation Ozan Unal, Christos Sakaridis, Luc Van Gool
PDF
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation Omer Dahary, Or Patashnik, Kfir Aberman, Danny Cohen-Or
PDF
Be-Your-Outpainter: Mastering Video Outpainting Through Input-Specific Adaptation Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang, Xiaoyu Shi, Dazhong Shen, Guanglu Song, Yu Liu, Hongsheng Li
PDF
BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-Language Models Moon Ye-Bin, Nam Hyeon-Woo, Wonseok Choi, Tae-Hyun Oh
PDF
Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation Zikai Huang, Xuemiao Xu, Cheng Xu, Huaidong Zhang, Chenxi Zheng, Jing Qin, Shengfeng He
PDF
BenchLMM: Benchmarking Cross-Style Visual Capability of Large Multimodal Models Rizhao Cai, Zirui Song, Dayan Guan, Zhenhao Chen, Yaohang Li, Xing Luo, Chenyu Yi, Alex Kot
PDF
Benchmarking Object Detectors with COCO: A New Path Forward Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai
PDF
Benchmarking Spurious Bias in Few-Shot Image Classifiers Guangtao Zheng, Wenqian Ye, Aidong Zhang
PDF
Benchmarking the Robustness of Cross-View Geo-Localization Models Qingwang Zhang, Yingying Zhu
PDF
Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects Zicong Fan, Takehiko Ohkawa, Linlin Yang, Nie Lin, Zhishan Zhou, Shihao Zhou, Jiajun Liang, Zhong Gao, Xuanyang Zhang, Xue Zhang, Fei Li, Liu Zheng, Feng Lu, Karim Abou Zeid, Bastian Leibe, Jeongwan On, Seungryul Baek, Aditya Prakash, Saurabh Gupta, Kun He, Yoichi Sato, Otmar Hilliges, Hyung Jin Chang, Angela Yao
PDF
BeNeRF: Neural Radiance Fields from a Single Blurry Image and Event Stream Wenpu Li, Pian Wan, Peng Wang, Jinghang Li, Yi Zhou, Peidong Liu
PDF
Beta-Tuned Timestep Diffusion Model Tianyi Zheng, Peng-Tao Jiang, Ben Wan, Hao Zhang, Jinwei Chen, Jia Wang, Bo Li
PDF
Betrayed by Attention: A Simple yet Effective Approach for Self-Supervised Video Object Segmentation Shuangrui Ding, Rui Qian, Haohang Xu, Dahua Lin, Hongkai Xiong
PDF
Better Call SAL: Towards Learning to Segment Anything in LiDAR Aljosa Osep, Tim Meinhardt, Francesco Ferroni, Neehar Peri, Deva Ramanan, Laura Leal-Taixé
PDF
Better Regression Makes Better Test-Time Adaptive 3D Object Detection Jiakang Yuan, Bo Zhang, Kaixiong Gong, Xiangyu Yue, Botian Shi, Yu Qiao, Tao Chen
PDF
Beyond MOT: Semantic Multi-Object Tracking Yunhao Li, Qin Li, Hao Wang, Xue Ma, Jiali Yao, Shaohua Dong, Heng Fan, Libo Zhang
PDF
Beyond Pixels: Semi-Supervised Semantic Segmentation with a Multi-Scale Patch-Based Multi-Label Classifier Prantik Howlader, Srijan Das, Hieu Le, Dimitris Samaras
PDF
Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning Xinyuan Gao, Songlin Dong, Yuhang He, Qiang Wang, Yihong Gong
PDF
Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-Trained 2D Diffusion Models Hyeonwoo Kim, Sookwan Han, Patrick Kwon, Hanbyul Joo
PDF
Beyond the Data Imbalance: Employing the Heterogeneous Datasets for Vehicle Maneuver Prediction Hyeongseok Jeon, Sanmin Kim, Abi Rahman Syamil, Junsoo Kim, Dongsuk Kum
PDF
Beyond Viewpoint: Robust 3D Object Recognition Under Arbitrary Views Through Joint Multi-Part Representation Linlong Fan, Ye Huang, Yanqi Ge, Wen Li, Lixin Duan
PDF
BeyondScene: Higher-Resolution Human-Centric Scene Generation with Pretrained Diffusion Gwanghyun Kim, Hayeon Kim, Hoigi Seo, Dong Un Kang, Se Young Chun
PDF
Bi-Directional Contextual Attention for 3D Dense Captioning Minjung Kim, Hyung Suk Lim, Soonyoung Lee, Bumsoo Kim, Gunhee Kim
PDF
BI-MDRG: Bridging Image History in Multimodal Dialogue Response Generation Hee Suk Yoon, Eunseop Yoon, Joshua Tian Jin Tee, Kang Zhang, Yu-Jung Heo, Du-Seong Chang, Chang D. Yoo
PDF
Bi-TTA: Bidirectional Test-Time Adapter for Remote Physiological Measurement Haodong Li, Hao Lu, Yingcong Chen
PDF
Bidirectional Progressive Transformer for Interaction Intention Anticipation Zichen Zhang, Hongchen Luo, Wei Zhai, Yu Kang, Yang Cao
PDF
Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model Zhening Liu, Xinjie Zhang, Jiawei Shao, Zehong Lin, Jun Zhang
PDF
Bidirectional Uncertainty-Based Active Learning for Open-Set Annotation Chen-Chen Zong, Ye-Wen Wang, Kun-Peng Ning, Hai-Bo Ye, Sheng-Jun Huang
PDF
Binomial Self-Compensation for Motion Error in Dynamic 3D Scanning Geyou Zhang, Ce Zhu, Kai Liu
PDF
BK-SDM: A Lightweight, Fast, and Cheap Version of Stable Diffusion Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells, Shinkook Choi
PDF
BKDSNN: Enhancing the Performance of Learning-Based Spiking Neural Networks Training with Blurred Knowledge Distillation Zekai Xu, Kang You, Qinghai Guo, Xiang Wang, Zhezhi He
PDF
BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering Xinmin Qiu, Congying Han, Zicheng Zhang, Bonan Li, Tiande Guo, Pingyu Wang, Xuecheng Nie
PDF
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models Ian Huang, Guandao Yang, Leonidas Guibas
PDF
Blind Image Deblurring with Noise-Robust Kernel Estimation Chanseok Lee, Jeongsol Kim, Seungmin Lee, Jaehwang Jung, Yunje Cho, Taejoong Kim, Taeyong Jo, Myungjun Lee, Mooseok Jang
PDF
Blind Image Deconvolution by Generative-Based Kernel Prior and Initializer via Latent Encoding Jiangtao Zhang, Zongsheng Yue, Hui Wang, Qian Zhao, Deyu Meng
PDF
BLINK: Multimodal Large Language Models Can See but Not Perceive Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A Smith, Wei-Chiu Ma, Ranjay Krishna
PDF
BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation Using RGB Frames and Events Yijin Li, Yichen Shen, Zhaoyang Huang, Shuo Chen, Weikang Bian, Xiaoyu Shi, Fu-Yun Wang, Keqiang Sun, Hujun Bao, Zhaopeng Cui, Guofeng Zhang, Hongsheng Li
PDF
Bones Can't Be Triangles: Accurate and Efficient Vertebrae Keypoint Estimation Through Collaborative Error Revision Jinhee Kim, Taesung Kim, Jaegul Choo
PDF
Boost Your NeRF: A Model-Agnostic Mixture of Experts Framework for High Quality and Efficient Rendering Francesco Di Sario, Riccardo Renzulli, Marco Grangetto, Enzo Tartaglione
PDF
Boosting 3D Single Object Tracking with 2D Matching Distillation and 3D Pre-Training Qiangqiang Wu, Yan Xia, Jia Wan, Antoni Chan
PDF
Boosting Gaze Object Prediction via Pixel-Level Supervision from Vision Foundation Model Yang Jin, Lei Zhang, Shi Yan, Bin Fan, Binglu Wang
PDF
Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training Cheng Tan, Jingxuan Wei, Zhangyang Gao, Linzhuang Sun, Siyuan Li, Ruifeng Guo, BiHui Yu, Stan Z. Li
PDF
Boosting Transferability in Vision-Language Attacks via Diversification Along the Intersection Region of Adversarial Trajectory Sensen Gao, Xiaojun Jia, Xuhong Ren, Ivor Tsang, Qing Guo
PDF
Bottom-up Domain Prompt Tuning for Generalized Face Anti-Spoofing Siqi Liu, Qirui Wang, Pong C. Yuen
PDF
Brain Netflix: Scaling Data to Reconstruct Videos from Brain Signals Camilo L Fosco, Benjamin Lahner, Bowen Pan, Alex Andonian, Emilie L Josephs, Alex Lascelles, Aude Oliva
PDF
Brain-ID: Learning Contrast-Agnostic Anatomical Representations for Brain Imaging Peirong Liu, Oula Puonti, Xiaoling Hu, Daniel C. Alexander, Juan E. Iglesias
PDF
BRAVE: Broadening the Visual Encoding of Vision-Language Models Oğuzhan Fatih Kar, Alessio Tonioni, Petra Poklukar, Achin Kulshrestha, Amir Zamir, Federico Tombari
PDF
Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection Qijie Mo, Yipeng Gao, Shenghao Fu, Junkai Yan, Ancong Wu, Wei-Shi Zheng
PDF
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
PDF
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation Shihao Zhao, Shaozhe Hao, Bojia Zi, Huaizhe Xu, Kwan-Yee K. Wong
PDF
Bridging Synthetic and Real Worlds for Pre-Training Scene Text Detectors Tongkun Guan, Wei Shen, Xue Yang, Xuehui Wang, Xiaokang Yang
PDF
Bridging the Gap Between Human Motion and Action Semantics via Kinematics Phrases Xinpeng Liu, Yong-Lu Li, Ailing Zeng, Zizheng Zhou, Yang You, Cewu Lu
PDF
Bridging the Gap: Studio-like Avatar Creation from a Monocular Phone Capture ShahRukh Athar, Shunsuke Saito, Stanislav Pidhorskyi, Zhengyu Yang, Chen Cao
PDF
Bridging the Pathology Domain Gap: Efficiently Adapting CLIP for Pathology Image Analysis with Limited Labeled Data Zhengfeng Lai, Joohi Chauhan, Brittany N. Dugger, Chen-Nee Chuah
PDF
BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion Xuan Ju, Xian Liu, Xintao Wang, Yuxuan Bian, Ying Shan, Qiang Xu
PDF
Bucketed Ranking-Based Losses for Efficient Training of Object Detectors Feyza Yavuz, Baris Can Cam, Adnan Harun Dogan, Kemal Oksuz, Emre Akbas, Sinan Kalkan
PDF
BugNIST - A Large Volumetric Dataset for Detection Under Domain Shift Patrick M Jensen, Vedrana A Dahl, Rebecca Engberg, Carsten Gundlach, Hans Martin Kjer, Anders B Dahl
PDF
BurstM: Deep Burst Multi-Scale SR Using Fourier Space with Optical Flow EungGu Kang, Byeonghun Lee, Sunghoon Im, Kyong Hwan Jin
PDF
ByteEdit: Boost, Comply and Accelerate Generative Image Editing Yuxi Ren, Jie Wu, Yanzuo Lu, Huafeng Kuang, Xin Xia, Xionghui Wang, Qianqian Wang, Yixing Zhu, Pan Xie, Shiyin Wang, Xuefeng Xiao, Yitong Wang, Min Zheng, Lean Fu
PDF
C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition Rongchang Li, Zhenhua Feng, Tianyang Xu, Linze Li, Xiao-Jun Wu, Muhammad Awais, Sara Atito, Josef Kittler
PDF
CadVLM: Bridging Language and Vision in the Generation of Parametric CAD Sketches Sifan Wu, Amir Hosein Khasahmadi, Mor Katz, Pradeep Kumar Jayaraman, Yewen Pu, Karl D.D. Willis, Bang Liu
PDF
CaesarNeRF: Calibrated Semantic Representation for Few-Shot Generalizable Neural Rendering Haidong Zhu, Tianyu Ding, Tianyi Chen, Ilya Zharkov, Ram Nevatia, Luming Liang
PDF
Caltech Aerial RGB-Thermal Dataset in the Wild Connor Lee, Matthew Anderson, Nikhil Ranganathan, Xingxing Zuo, Kevin T Do, Georgia Gkioxari, Soon-Jo Chung
PDF
Camera Calibration Using a Collimator System Shunkun Liang, Banglei Guan, Zhenbao Yu, Pengju Sun, Yang Shang
PDF
Camera Height Doesn't Change: Unsupervised Training for Metric Monocular Road-Scene Depth Estimation Genki Kinoshita, Ko Nishino
PDF
Camera-LiDAR Cross-Modality Gait Recognition Wenxuan Guo, Yingping Liang, Zhiyu Pan, Ziheng Xi, Jianjiang Feng, Jie Zhou
PDF
CamoTeacher: Dual-Rotation Consistency Learning for Semi-Supervised Camouflaged Object Detection Xunfa Lai, Zhiyu Yang, Jie Hu, ShengChuan Zhang, Liujuan Cao, Guannan Jiang, Songan Zhang, Zhiyu Wang, Rongrong Ji
PDF
Can OOD Object Detectors Learn from Foundation Models? Jiahui Liu, Xin Wen, Shizhen Zhao, Yingxian Chen, Xiaojuan Qi
PDF
Can Textual Semantics Mitigate Sounding Object Segmentation Preference? Yaoting Wang, Peiwen Sun, Yuanchao Li, Honggang Zhang, Di Hu
PDF
Canonical Shape Projection Is All You Need for 3D Few-Shot Class Incremental Learning Ali Cheraghian, Zeeshan Hayder, Sameera Ramasinghe, Shafin Rahman, Javad Jafaryahya, Lars Petersson, Mehrtash Harandi
PDF
CanonicalFusion: Generating Drivable 3D Human Avatars from Multiple Images Jisu Shin, Junmyeong Lee, Seongmin Lee, Min-Gyu Park, Jumi Kang, Ju Hong Yoon, Hae-Gon Jeon
PDF
CARB-Net: Camera-Assisted Radar-Based Network for Vulnerable Road User Detection Wei-Yu Lee, Martin Dimitrievski, David Van Hamme, Jan Aelterman, Ljubomir Jovanov, Wilfried Philips
PDF
CardiacNet: Learning to Reconstruct Abnormalities for Cardiac Disease Assessment from Echocardiogram Videos Jiewen Yang, Yiqun Lin, Bin Pu, Jiarong Guo, Xiaowei Xu, Xiaomeng Li
PDF
CARFF: Conditional Auto-Encoded Radiance Field for 3D Scene Forecasting Jiezhi Yang, Khushi P Desai, Charles Packer, Harshil Bhatia, Nicholas Rhinehart, Rowan McAllister, Joseph E Gonzalez
PDF
CarFormer: Self-Driving with Learned Object-Centric Representations Shadi Hamdan, Fatma Guney
PDF
Cascade Prompt Learning for Visual-Language Model Adaptation Ge Wu, Xin Zhang, Zheng Li, Zhaowei Chen, Jiajun Liang, Jian Yang, Xiang Li
PDF
Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views Yabo Chen, Jiemin Fang, Yuyang Huang, Taoran Yi, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian
PDF
CAT-SAM: Conditional Tuning for Few-Shot Adaptation of Segment Anything Model Aoran Xiao, Weihao Xuan, Heli Qi, Yun Xing, Ruijie Ren, Xiaoqin Zhang, Ling Shao, Shijian Lu
PDF
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios Qilang Ye, Zitong Yu, Rui Shao, Xinyu Xie, Philip Torr, Xiaochun Cao
PDF
Catastrophic Overfitting: A Potential Blessing in Disguise Mn Zhao, Lihe Zhang, Yuqiu Kong, Baocai Yin
PDF
CatchBackdoor: Backdoor Detection via Critical Trojan Neural Path Fuzzing Haibo Jin, Ruoxi Chen, Jinyin Chen, Haibin Zheng, Yang Zhang, Haohan Wang
PDF
Category Adaptation Meets Projected Distillation in Generalized Continual Category Discovery Grzegorz Rypeść, Daniel Marczak, Sebastian Cygert, Tomasz Trzcinski, Bartlomiej Twardowski
PDF
Category-Level Object Detection, Pose Estimation and Reconstruction from Stereo Images Chuanrui Zhang, Yonggen Ling, Minglei Lu, Minghan Qin, Haoqian Wang
PDF
Causal Subgraphs and Information Bottlenecks: Redefining OOD Robustness in Graph Neural Networks Weizhi An, Wenliang Zhong, Feng Jiang, Hehuan Ma, Junzhou Huang
PDF
Causality-Inspired Discriminative Feature Learning in Triple Domains for Gait Recognition Haijun Xiong, Bin Feng, Xinggang Wang, Wenyu Liu
PDF
CC-SAM: Enhancing SAM with Cross-Feature Attention and Context for Ultrasound Image Segmentation Shreyank N Gowda, David A Clifton
PDF
cDP-MIL: Robust Multiple Instance Learning via Cascaded Dirichlet Process Yihang Chen, Tsai Hor Chan, Guosheng Yin, Yuming Jiang, Lequan Yu
PDF
Centering the Value of Every Modality: Towards Efficient and Resilient Modality-Agnostic Semantic Segmentation Xu Zheng, Yuanhuiyi Lyu, Jiazhou Zhou, Lin Wang
PDF
Certifiably Robust Image Watermark Zhengyuan Jiang, Moyang Guo, Yuepeng Hu, Jinyuan Jia, Neil Zhenqiang Gong
PDF
CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-Aware 3D Gaussian Field Jiarui Hu, Xianhao Chen, Boyin Feng, Guanglin Li, Liangjing Yang, Hujun Bao, Guofeng Zhang, Zhaopeng Cui
PDF
Chains of Diffusion Models Yanheng Wei, Lianghua Huang, Zhi-Fan Wu, Wei Wang, Yu Liu, Mingda Jia, Shuailei Ma
PDF
Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning Chongyu Fan, Jiancheng Liu, Alfred Hero, Sijia Liu
PDF
Chameleon: A Data-Efficient Generalist for Dense Visual Prediction in the Wild Donggyun Kim, Seongwoong Cho, Semin Kim, Chong Luo, Seunghoon Hong
PDF
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, Zilong Dong, Yinghui Xu, Xun Cao, Yao Yao, Hao Zhu, Siyu Zhu
PDF
Characterizing Model Robustness via Natural Input Gradients Adrian Rodriguez-Munoz, Tongzhou Wang, Antonio Torralba
PDF
Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts Shuangkang Fang, Yufeng Wang, Yi-Hsuan Tsai, Yi Yang, Wenrui Ding, Shuchang Zhou, Ming-Hsuan Yang
PDF
ChEX: Interactive Localization and Region Description in Chest X-Rays Philip Müller, Georgios Kaissis, Daniel Rueckert
PDF
Chronologically Accurate Retrieval for Temporal Grounding of Motion-Language Models Kent Fujiwara, Mikihiro Tanaka, Qing Yu
PDF
CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation Kalliopi Basioti, Mohamed A Abdelsalam, Federico Fancellu, Vladimir Pavlovic, Afsaneh Fazly
PDF
CipherDM: Secure Three-Party Inference for Diffusion Model Sampling Xin Zhao, Xiaojun Chen, Xudong Chen, He Li, Tingyu Fan, Zhendong Zhao
PDF
City-on-Web: Real-Time Neural Rendering of Large-Scale Scenes on the Web Kaiwen Song, Xiaoyi Zeng, Chenqu Ren, Juyong Zhang
PDF
CityGaussian: Real-Time High-Quality Large-Scale Scene Rendering with Gaussians Yang Liu, Chuanchen Luo, Lue Fan, Naiyan Wang, Junran Peng, Zhaoxiang Zhang
PDF
CityGuessr: City-Level Video Geo-Localization on a Global Scale Parth Parag Kulkarni, Gaurav Kumar Nayak, Mubarak Shah
PDF
CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs Akshat Ramachandran, Souvik Kundu, Tushar Krishna
PDF
CLAP: Isolating Content from Style Through Contrastive Learning with Augmented Prompts Yichao Cai, Yuhang Liu, Zhen Zhang, Javen Qinfeng Shi
PDF
Class-Agnostic Object Counting with Text-to-Image Diffusion Model Xiaofei Hui, Qian Wu, Hossein Rahmani, Jun Liu
PDF
Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion Linlan Huang, Xusheng Cao, Haori Lu, Xialei Liu
PDF
Classification Matters: Improving Video Action Detection with Class-Specific Attention Jinsung Lee, Taeoh Kim, Inwoong Lee, Minho Shim, Dongyoon Wee, Minsu Cho, Suha Kwak
PDF
Clean & Compact: Efficient Data-Free Backdoor Defense with Model Compactness Huy Phan, Jinqi Xiao, Yang Sui, Tianfang Zhang, Zijie Tang, Cong Shi, Yan Wang, Yingying Chen, Bo Yuan
PDF
ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference Mengcheng Lan, Chaofeng Chen, Yiping Ke, Xinjiang Wang, Litong Feng, Wayne Zhang
PDF
Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation Zhihang Zhong, Gurunandan Krishnan, Xiao Sun, Yu Qiao, Sizhuo Ma, Jian Wang
PDF
CLEO: Continual Learning of Evolving Ontologies Shishir Muralidhara, Saqib Bukhari, Georg Dr. Schneider, Didier Stricker, René Schuster
PDF
Click Prompt Learning with Optimal Transport for Interactive Segmentation Jie Liu, Haochen Wang, Wenzhe Yin, Jan-Jakob Sonke, Efstratios Gavves
PDF
Click-Gaussian: Interactive Segmentation to Any 3D Gaussians Seokhun Choi, Hyeonseop Song, Jaechul Kim, Taehyeong Kim, Hoseok Do
PDF
CLIFF: Continual Latent Diffusion for Open-Vocabulary Object Detection Wuyang Li, Xinyu Liu, Jiayi Ma, Yixuan Yuan
PDF
CliffPhys: Camera-Based Respiratory Measurement Using Clifford Neural Networks Omar Ghezzi, Giuseppe Boccignone, Giuliano Grossi, Raffaella Lanzarotti, Alessandro D'Amelio
PDF
CLIP-DINOiser: Teaching CLIP a Few DINO Tricks for Open-Vocabulary Semantic Segmentation Monika Wysoczańska, Oriane Siméoni, Michaël Ramamonjisoa, Andrei Bursuc, Tomasz Trzciński, Patrick Pérez
PDF
CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs Yassine Ouali, Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos
PDF
CLIP-Guided Generative Networks for Transferable Targeted Adversarial Attacks Hao Fang, Jiawei Kong, Bin Chen, Tao Dai, Hao Wu, Shu-Tao Xia
PDF
Close, but Not There: Boosting Geographic Distance Sensitivity in Visual Place Recognition Sergio Izquierdo, Javier Civera
PDF
Closed-Loop Unsupervised Representation Disentanglement with $\beta$-VAE Distillation and Diffusion Probabilistic Feedback Xin Jin, Bohan Li, Baao Xie, Wenyao Zhang, Jinming Liu, Ziqiang Li, Tao Yang, Wenjun Zeng
PDF
CLOSER: Towards Better Representation Learning for Few-Shot Class-Incremental Learning Junghun Oh, Sungyong Baik, Kyoung Mu Lee
PDF
CloudFixer: Test-Time Adaptation for 3D Point Clouds via Diffusion-Guided Geometric Transformation Hajin Shim, Changhun Kim, Eunho Yang
PDF
CLR-GAN: Improving GANs Stability and Quality via Consistent Latent Representation and Reconstruction Shengke Sun, Ziqian Luan, Zhanshan Zhao, Shijie Luo, Shuzhen Han
PDF
ClusteringSDF: Self-Organized Neural Implicit Surfaces for 3D Decomposition Tianhao Wu, Chuanxia Zheng, Qianyi Wu, Tat-Jen Cham
PDF
CMD: A Cross Mechanism Domain Adaptation Dataset for 3D Object Detection Jinhao Deng, Wei Ye, Hai Wu, Qiming Xia, Xun Huang, Xin Li, Jin Fang, Wei Li, Chenglu Wen, Cheng Wang
PDF
CMTA: Cross-Modal Temporal Alignment for Event-Guided Video Deblurring Taewoo Kim, Hoonhee Cho, Kuk-Jin Yoon
PDF
Co-Speech Gesture Video Generation with 3D Human Meshes Aniruddha Mahapatra, Richa Mishra, Ziyi Chen, Boyang Ding, Renda Li, Shoulei Wang, Jun-Yan Zhu, Peng Chang, Mei Han, Jing Xiao
PDF
Co-Student: Collaborating Strong and Weak Students for Sparsely Annotated Object Detection Lianjun Wu, Jiangxiao Han, Zengqiang Zheng, Xinggang Wang
PDF
Co-Synthesis of Histopathology Nuclei Image-Label Pairs Using a Context-Conditioned Joint Diffusion Model Seonghui Min, Hyun-Jic Oh, Won-Ki Jeong
PDF
Coarse-to-Fine Implicit Representation Learning for 3D Hand-Object Reconstruction from a Single RGB-D Image Xingyu Liu, Pengfei Ren, Jingyu Wang, Qi Qi, Haifeng Sun, Zirui Zhuang, Jianxin Liao
PDF
Cocktail Universal Adversarial Attack on Deep Neural Networks Shaoxin Li, Xiaofeng Liao, Xin Che, Xintong Li, Yong Zhang, Lingyang Chu
PDF
COD: Learning Conditional Invariant Representation for Domain Adaptation Regression Hao-Ran Yang, Chuan-Xian Ren, You-Wei Luo
PDF
CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning ZiYang Gong, FuHao Li, Yupeng Deng, Deblina Bhattacharjee, Xianzheng Ma, Xiangwei Zhu, Zhenming Ji
PDF
CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion Wendi Zheng, Jiayan Teng, Zhuoyi Yang, Weihan Wang, Jidong Chen, Xiaotao Gu, Yuxiao Dong, Ming Ding, Jie Tang
PDF
CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians Avinash Paliwal, Wei Ye, Jinhui Xiong, Dmytro Kotovenko, Rakesh Ranjan, Vikas Chandra, Nima Khademi Kalantari
PDF
COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation Liu He, Daniel Aliaga
PDF
COIN-Matting: Confounder Intervention for Image Matting Zhaohe Liao, Jiangtong Li, Jun Lan, Huijia Zhu, Weiqiang Wang, Li Niu, Liqing Zhang
PDF
COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation Jiefeng Li, Ye Yuan, Davis Rempe, Haotian Zhang, Pavlo Molchanov, Cewu Lu, Jan Kautz, Umar Iqbal
PDF
CoLA: Conditional Dropout and Language-Driven Robust Dual-Modal Salient Object Detection Shuang Hao, Chunlin Zhong, He Tang
PDF
CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing Faegheh Sardari, Armin Mustafa, Philip JB Jackson, Adrian Hilton
PDF
Collaborative Control for Geometry-Conditioned PBR Image Generation Shimon Vainer, Mark Boss, Mathias Parger, Konstantin Kutsy, Dante De Nigris, Ciara Rowles, Nicolas Perony, Simon Donné
PDF
Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation Siyu Jiao, Hongguang Zhu, Yunchao Wei, Yao Zhao, Jiannan Huang, Humphrey Shi
PDF
ColorMAE: Exploring Data-Independent Masking Strategies in Masked AutoEncoders Carlos Hinojosa, Shuming Liu, Bernard Ghanem
PDF
ColorMNet: A Memory-Based Deep Spatial-Temporal Feature Propagation Network for Video Colorization Yixin Yang, Jiangxin Dong, Jinhui Tang, Jinshan Pan
PDF
ColorPeel: Color Prompt Learning with Diffusion Models via Color and Shape Disentanglement Muhammad Atif Butt, Kai Wang, Javier Vazquez-Corral, Joost van de Weijer
PDF
COM Kitchens: An Unedited Overhead-View Procedural Videos Dataset as a Vision-Language Benchmark Atsushi Hashimoto, Koki Maeda, Tosho Hirasawa, Jun Harashima, Leszek Rybicki, Yusuke Fukasawa, Yoshitaka Ushiku
PDF
Combining Generative and Geometry Priors for Wide-Angle Portrait Correction Lan Yao, Chaofeng Chen, Xiaoming Li, Zifei Yan, Wangmeng Zuo
PDF
ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance Yongwei Chen, Tengfei Wang, Tong Wu, Xingang Pan, Kui Jia, Ziwei Liu
PDF
ComFusion: Enhancing Personalized Generation by Instance-Scene Compositing and Fusion Yan Hong, Yuxuan Duan, Bo Zhang, Haoxing Chen, Jun Lan, Huijia Zhu, Weiqiang Wang, Jianfu Zhang
PDF
Common Sense Reasoning for Deep Fake Detection Yue Zhang, Ben Colman, Xiao Guo, Ali Shahriyari, Gaurav Bharaj
PDF
Commonly Interesting Images Fitim Abdullahu, Helmut Grabner
PDF
COMO: Compact Mapping and Odometry Eric Dexheimer, Andrew Davison
PDF
CoMo: Controllable Motion Generation Through Language Guided Pose Code Editing Yiming Huang, Weilin Wan, Yue Yang, Chris Callison-Burch, Mark Yatskar, Lingjie Liu
PDF
Compact 3D Scene Representation via Self-Organizing Gaussian Grids Wieland Morgenstern, Florian Barthel, Anna Hilsmann, Peter Eisert
PDF
Compensation Sampling for Improved Convergence in Diffusion Models Hui Lu, Albert Ali Salah, Ronald Poppe
PDF
CompGS: Smaller and Faster Gaussian Splatting with Vector Quantization K L Navaneet, Kossar Pourahmadi Meibodi, Soroush Abbasi Koohpayegani, Hamed Pirsiavash
PDF
COMPOSE: Comprehensive Portrait Shadow Editing Andrew Z Hou, Zhixin Shu, Xuaner Zhang, He Zhang, Yannick Hold-Geoffroy, Jae Shin Yoon, Xiaoming Liu
PDF
Compositional Substitutivity of Visual Reasoning for Visual Question Answering Chuanhao Li, Zhen Li, Chenchen Jing, Yuwei Wu, Mingliang Zhai, Yunde Jia
PDF
Comprehensive Attribution: Inherently Explainable Vision Model with Feature Detector Xianren Zhang, Dongwon Lee, Suhang Wang
PDF
Compress3D: A Compressed Latent Space for 3D Generation from a Single Image Bowen Zhang, Tianyu Yang, Yu Li, Lei Zhang, Xi Zhao
PDF
Computing the Lipschitz Constant Needed for Fast Scene Recovery from CASSI Measurements Niels Chr Overgaard, Anders Holst
PDF
CoMusion: Towards Consistent Stochastic Human Motion Prediction via Motion Diffusion Jiarui Sun, Girish Chowdhary
PDF
Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models Vitali Petsiuk, Kate Saenko
PDF
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models Rohit Gandikota, Joanna Materzynska, Tingrui Zhou, Antonio Torralba, David Bau
PDF
ConceptExpress: Harnessing Diffusion Models for Single-Image Unsupervised Concept Extraction Shaozhe Hao, Kai Han, Zhengyao Lv, Shihao Zhao, Kwan-Yee K. Wong
PDF
Conceptual Codebook Learning for Vision-Language Models Yi Zhang, Ke Yu, Siqi Wu, Zhihai He
PDF
Concise Plane Arrangements for Low-Poly Surface and Volume Modelling Raphael Sulzer, Florent Lafarge
PDF
CONDA: Condensed Deep Association Learning for Co-Salient Object Detection Long Li, Nian Liu, Dingwen Zhang, Zhongyu Li, Salman Khan, Rao Anwer, Hisham Cholakkal, Junwei Han, Fahad Shahbaz Khan
PDF
ConDense: Consistent 2D-3D Pre-Training for Dense and Sparse Features from Multi-View Images Xiaoshuai Zhang, Zhicheng Wang, Howard Zhou, Soham Ghosh, Danushen L Gnanapragasam, Varun Jampani, Hao Su, Leonidas Guibas
PDF
Confidence Self-Calibration for Multi-Label Class-Incremental Learning Kaile Du, Yifan Zhou, Fan Lyu, Yuyang Li, Chen Lu, Guangcan Liu
PDF
Confidence-Based Iterative Generation for Real-World Image Super-Resolution Jialun Peng, Xin Luo, Jingjing Fu, Dong Liu
PDF
ConGeo: Robust Cross-View Geo-Localization Across Ground View Variations Li Mi, Chang Xu, Javiera Castillo Navarro, Syrielle Montariol, Wen Yang, Antoine Bosselut, Devis Tuia
PDF
Connecting Consistency Distillation to Score Distillation for Text-to-3D Generation Zongrui Li, Minghui Hu, Qian Zheng, Xudong Jiang
PDF
Consistent 3D Line Mapping Xulong Bai, Hainan Cui, Shuhan Shen
PDF
Constructing Concept-Based Models to Mitigate Spurious Correlations with Minimal Human Effort Jeeyung Kim, Ze Wang, Qiang Qiu
PDF
Content-Aware Radiance Fields: Aligning Model Complexity with Scene Intricacy Through Learned Bitwidth Quantization Weihang Liu, Xue Xian Zheng, Jingyi Yu, Xin Lou
PDF
Context Diffusion: In-Context Aware Image Generation Ivona Najdenkoska, Animesh Sinha, Abhimanyu Dubey, Dhruv Mahajan, Vignesh Ramanathan, Filip Radenovic
PDF
Context-Aware Action Recognition: Introducing a Comprehensive Dataset for Behavior Contrast Tatsuya Sasaki, Yoshiki Ito, Satoshi Kondo
PDF
Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation Zhenliang Ni, Xinghao Chen, Yingjie Zhai, Yehui Tang, Yunhe Wang
PDF
Contextual Correspondence Matters: Bidirectional Graph Matching for Video Summarization Yunzuo Zhang, Yameng Liu
PDF
Continual Learning and Unknown Object Discovery in 3D Scenes via Self-Distillation Mohamed El Amine Boudjoghra, Jean Lahoud, Salman Khan, Hisham Cholakkal, Rao M Anwer, Fahad Shahbaz Khan
PDF
Continual Learning for Remote Physiological Measurement: Minimize Forgetting and Simplify Inference Qian Liang, Yan Chen, Yang Hu
PDF
Continuity Preserving Online CenterLine Graph Learning Yunhui Han, Kun Yu, Zhiwei Li
PDF
Continuous Memory Representation for Anomaly Detection Joo Chan Lee, Taejune Kim, Eunbyung Park, Simon S Woo, Jong Hwan Ko
PDF
Continuous SO(3) Equivariant Convolution for 3D Point Cloud Analysis Jaein Kim, Hee Bin Yoo, Dong-Sig Han, Yeon-Ji Song, Byoung-Tak Zhang
PDF
Contourlet Residual for Prompt Learning Enhanced Infrared Image Super-Resolution Xingyuan Li, Jinyuan Liu, Zhixin Chen, Yang Zou, Long Ma, Xin Fan, Risheng Liu
PDF
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities Lorenzo Baraldi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Alessandro Nicolosi, Rita Cucchiara
PDF
Contrastive Ground-Level Image and Remote Sensing Pre-Training Improves Representation Learning for Natural World Imagery Andy V Huynh, Lauren Gillespie, Jael Lopez-Saucedo, Claire Tang, Rohan Sikand, Moisés Expósito-Alonso
PDF
Contrastive Learning with Synthetic Positives Dewen Zeng, Xinrong Hu, Yawen Wu, Xiaowei Xu, Yiyu Shi
PDF
Contrastive Region Guidance: Improving Grounding in Vision-Language Models Without Training David Wan, Jaemin Cho, Elias Stengel-Eskin, Mohit Bansal
PDF
Contribution-Based Low-Rank Adaptation with Pre-Training Model for Real Image Restoration Dongwon Park, Hayeon Kim, Se Young Chun
PDF
ControlCap: Controllable Region-Level Captioning Yuzhong Zhao, Liu Yue, Zonghao Guo, Weijia Wu, Chen Gong, Qixiang Ye, Fang Wan
PDF
Controllable Contextualized Image Captioning: Directing the Visual Narrative Through User-Defined Highlights Shunqi Mao, Chaoyi Zhang, Hang Su, Hwanjun Song, Igor Shalyminov, Weidong Cai
PDF
Controllable Human-Object Interaction Synthesis Jiaman Li, Alexander Clegg, Roozbeh Mottaghi, Jiajun Wu, Xavier Puig, C. Karen Liu
PDF
Controllable Navigation Instruction Generation with Chain of Thought Prompting Xianghao Kong, Jinyu Chen, Wenguan Wang, Hang Su, Xiaolin Hu, Yi Yang, Si Liu
PDF
Controlling the World by Sleight of Hand Sruthi Sudhakar, Ruoshi Liu, Basile Van Hoorick, Carl Vondrick, Richard Zemel
PDF
ControlLLM: Augment Language Models with Tools by Searching on Graphs Zhaoyang Liu, Zeqiang Lai, Zhangwei Gao, Erfei Cui, Ziheng Li, Xizhou Zhu, Lewei Lu, Qifeng Chen, Yu Qiao, Jifeng Dai, Wenhai Wang
PDF
ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems Denis Zavadski, Johann-Friedrich Feiden, Carsten Rother
PDF
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback Ming Li, Taojiannan Yang, Huafeng Kuang, Jie Wu, Zhaoning Wang, Xuefeng Xiao, Chen Chen
PDF
Convex Relaxations for Manifold-Valued Markov Random Fields with Approximation Guarantees Robin Kenis, Emanuel Laude, Panagiotis Patrinos
PDF
CoPT: Unsupervised Domain Adaptive Segmentation Using Domain-Agnostic Text Embeddings Cristina Mata, Kanchana N Ranasinghe, Michael S Ryoo
PDF
CoR-GS: Sparse-View 3D Gaussian Splatting via Co-Regularization Jiawei Zhang, Jiahe Li, Xiaohan Yu, Lei Huang, Lin Gu, Jin Zheng, Xiao Bai
PDF
CoReS: Orchestrating the Dance of Reasoning and Segmentation Xiaoyi Bao, Siyang Sun, Shuailei Ma, Kecheng Zheng, Yuxin Guo, Guosheng Zhao, Yun Zheng, Xingang Wang
PDF
Correspondence-Free SE(3) Point Cloud Registration in RKHS via Unsupervised Equivariant Learning Ray Zhang, Zheming Zhou, Min Sun, Omid Ghasemalizadeh, Cheng-Hao Kuo, Ryan M. Eustice, Maani Ghaffari Jadidi, Arnie Sen
PDF
Correspondences of the Third Kind: Camera Pose Estimation from Object Reflection Kohei Yamashita, Vincent Lepetit, Ko Nishino
PDF
CoSIGN: Few-Step Guidance of ConSIstency Model to Solve General INverse Problems Jiankun Zhao, Bowen Song, Liyue Shen
PDF
COSMU: Complete 3D Human Shape from Monocular Unconstrained Images Marco Pesavento, Marco Volino, Adrian Hilton
PDF
CoTracker: It Is Better to Track Together Nikita Karaev, Ignacio Rocco, Ben Graham, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht
PDF
CountFormer: Multi-View Crowd Counting Transformer Hong Mo, Xiong Zhang, Jianchao Tan, Cheng Yang, Qiong Gu, Bo Hang, Wenqi Ren
PDF
CPM: Class-Conditional Prompting Machine for Audio-Visual Segmentation Yuanhong Chen, Chong Wang, Yuyuan Liu, Hu Wang, Gustavo Carneiro
PDF
CPT-VR: Improving Surface Rendering via Closest Point Transform with View-Reflection Appearance Zhipeng Hu, Yongqiang Zhang, Chen Liu, Lincheng Li, Sida Peng, Xiaowei Zhou, Changjie Fan, Xin Yu
PDF
CriSp: Leveraging Tread Depth Maps for Enhanced Crime-Scene Shoeprint Matching Samia Shafique, Shu Kong, Charless Fowlkes
PDF
CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model Zhengyi Wang, Yikai Wang, Yifei Chen, Chendong Xiang, Shuo Chen, Dajiang Yu, Chongxuan Li, Hang Su, Jun Zhu
PDF
CroMo-Mixup: Augmenting Cross-Model Representations for Continual Self-Supervised Learning Erum Mushtaq, Duygu Nur Yaldiz, Yavuz Faruk Bakman, Jie Ding, Chenyang Tao, Dimitrios Dimitriadis, Salman Avestimehr
PDF
Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector Yuqian Fu, Yu Wang, Yixuan Pan, Xingyu Qiu, Lian Huai, Zeyu Shangguan, Tong Liu, Yanwei Fu, Luc Van Gool, Xingqun Jiang
PDF
Cross-Domain Learning for Video Anomaly Detection with Limited Supervision Yashika Jain, Ali Dabouei, Min Xu
PDF
Cross-Domain Semantic Segmentation on Inconsistent Taxonomy Using VLMs Jeongkee Lim, Yusung Kim
PDF
Cross-Input Certified Training for Universal Perturbations Changming Xu, Gagandeep Singh
PDF
Cross-Platform Video Person ReID: A New Benchmark Dataset and Adaptation Approach Shizhou Zhang, Wenlong Luo, De Cheng, Qingchun Yang, Lingyan Ran, Yinghui Xing, Yanning Zhang
PDF
Cross-View Image Geo-Localization with Panorama-BEV Co-Retrieval Network Junyan Ye, Zhutao Lv, Weijia Li, Jinhua Yu, Haote Yang, Huaping Zhong, Conghui He
PDF
CrossGLG: LLM Guides One-Shot Skeleton-Based 3D Action Recognition in a Cross-Level Manner Tingbing Yan, Wenzheng Zeng, Yang Xiao, Xingyu Tong, Bo Tan, Zhiwen Fang, Zhiguo Cao, Joey Tianyi Zhou
PDF
CrossScore: A Multi-View Approach to Image Evaluation and Scoring Zirui Wang, Wenjing Bian, Victor Adrian Prisacariu
PDF
Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes Zhi Cai, Yingjie Gao, Yaoyan Zheng, Nan Zhou, Di Huang
PDF
Cs2K: Class-Specific and Class-Shared Knowledge Guidance for Incremental Semantic Segmentation Wei Cong, Yang Cong, Yuyang Liu, Gan Sun
PDF
CSOT: Cross-Scan Object Transfer for Semi-Supervised LiDAR Object Detection Jinglin Zhan, Tiejun Liu, Rengang Li, Zhaoxiang Zhang, Yuntao Chen
PDF
CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models Nick Stracke, Stefan Andreas Baumann, Joshua Susskind, Miguel Angel Bautista, Bjorn Ommer
PDF
Curved Diffusion: A Generative Model with Optical Geometry Control Andrey Voynov, Amir Hertz, Moab Arar, Shlomi Fruchter, Daniel Cohen-Or
PDF
Customize-a-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models Yixuan Ren, Yang Zhou, Jimei Yang, Jing Shi, Difan Liu, Feng Liu, Mingi Kwon, Abhinav Shrivastava
PDF
Customized Generation Reimagined: Fidelity and Editability Harmonized Jian Jin, Yang Shen, Zhenyong Fu, Jian Yang
PDF
Cut Out the Middleman: Revisiting Pose-Based Gait Recognition Yang Fu, Saihui Hou, Shibei Meng, Xuecai Hu, Chunshui Cao, Xu Liu, Yongzhen Huang
PDF
CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction Zhangchen Ye, Tao Jiang, Chenfeng Xu, Yiming Li, Hang Zhao
PDF
D-SCo: Dual-Stream Conditional Diffusion for Monocular Hand-Held Object Reconstruction Bowen Fu, Gu Wang, Chenyangguang Zhang, Yan Di, Ziqin Huang, Zhiying Leng, Fabian Manhardt, Xiangyang Ji, Federico Tombari
PDF
D4-VTON: Dynamic Semantics Disentangling for Differential Diffusion Based Virtual Try-on Zhaotong Yang, Zicheng Jiang, Xinzhe Li, Huiyu Zhou, Junyu Dong, Huaidong Zhang, Yong Du
PDF
DA-BEV: Unsupervised Domain Adaptation for Bird's Eye View Perception Kai Jiang, Jiaxing Huang, Weiying Xie, Jie Lei, Yunsong Li, Ling Shao, Shijian Lu
PDF
DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition Qi Wang, Zhou Xu, Yuming Lin, Jingtao Ye, Hongsheng Li, Guangming Zhu, Syed Afaq Ali Shah, Mohammed Bennamoun, Liang Zhang
PDF
DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion Junjie Guo, Chenqiang Gao, Fangcen Liu, Deyu Meng, Xinbo Gao
PDF
Data Augmentation via Latent Diffusion for Saliency Prediction Bahar Aydemir, Deblina Bhattacharjee, Tong Zhang, Mathieu Salzmann, Sabine Süsstrunk
PDF
Data Collection-Free Masked Video Modeling Yuchi Ishikawa, Masayoshi Kondo, Yoshimitsu Aoki
PDF
Data Overfitting for On-Device Super-Resolution with Dynamic Algorithm and Compiler Co-Design Gen Li, Zhihao Shu, Jie Ji, Minghai Qin, Fatemeh Afghah, Wei Niu, Xiaolong Ma
PDF
Data Poisoning Quantization Backdoor Attack Tran Huynh, Anh Tran, Khoa Doan, Tung Pham
PDF
Data-to-Model Distillation: Data-Efficient Learning Framework Ahmad Sajedi, Samir Khaki, Lucy Z. Liu, Ehsan Amjadian, Yuri A. Lawryshyn, Konstantinos N. Plataniotis
PDF
DataDream: Few-Shot Guided Dataset Generation Jae Myung Kim, Jessica Bader, Stephan Alaniz, Cordelia Schmid, Zeynep Akata
PDF
Dataset Distillation by Automatic Training Trajectories Dai Liu, Jindong Gu, Hu Cao, Carsten Trinitis, Martin Schulz
PDF
Dataset Enhancement with Instance-Level Augmentations Orest Kupyn, Christian Rupprecht
PDF
Dataset Growth Ziheng Qin, Zhaopan Xu, YuKun Zhou, Kai Wang, Zangwei Zheng, Zebang Cheng, Hao Tang, Lei Shang, Baigui Sun, Radu Timofte, Xiaojiang Peng, Hongxun Yao, Yang You
PDF
Dataset Quantization with Active Learning Based Adaptive Sampling Zhenghao Zhao, Yuzhang Shang, Junyi Wu, Yan Yan
PDF
DatasetNeRF: Efficient 3D-Aware Data Factory with Generative Radiance Fields Yu Chi, Fangneng Zhan, Sibo Wu, Christian Theobalt, Adam Kortylewski
PDF
DATENeRF: Depth-Aware Text-Based Editing of NeRFs Sara Rojas Martinez, Julien Philip, Kai Zhang, Sai Bi, Fujun Luan, Bernard Ghanem, Kalyan Sunkavalli
PDF
DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation Wenliang Zhao, Haolin Wang, Jie Zhou, Jiwen Lu
PDF
DCDM: Diffusion-Conditioned-Diffusion Model for Scene Text Image Super-Resolution Shrey Singh, Prateek Keserwani, Masakazu Iwamura, Partha Pratim Roy
PDF
De-Confounded Gaze Estimation Ziyang Liang, Yiwei Bao, Feng Lu
PDF
De-Confusing Pseudo-Labels in Source-Free Domain Adaptation Idit Diamant, Amir Rosenfeld, Idan Achituve, Jacob Goldberger, Arnon Netzer
PDF
DEAL: Disentangle and Localize Concept-Level Explanations for VLMs Tang Li, Mengmeng Ma, Xi Peng
PDF
Debiasing Surgeon: Fantastic Weights and How to Find Them Remi Nahon, Ivan Luiz De Moura Matos, Van-Tam Nguyen, Enzo Tartaglione
PDF
Deblur E-NeRF: NeRF from Motion-Blurred Events Under High-Speed or Low-Light Conditions Weng Fei Low, Gim Hee Lee
PDF
Deblurring 3D Gaussian Splatting Byeonghyeon Lee, Howoong Lee, Xiangyu Sun, Usman Ali, Eunbyung Park
PDF
DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism Zhen Wang, Xinyun Jiang, Jun Xiao, Tao Chen, Long Chen
PDF
DecentNeRFs: Decentralized Neural Radiance Fields from Crowdsourced Images Zaid Tasneem, Akshat Dave, Abhishek Singh, Kushagra Tiwary, Praneeth Vepakomma, Ashok Veeraraghavan, Ramesh Raskar
PDF
DECIDER: Leveraging Foundation Model Priors for Improved Model Failure Detection and Explanation Rakshith Subramanyam, Kowshik Thopalli, Vivek Sivaraman Narayanaswamy, Jayaraman J. Thiagarajan
PDF
Deciphering the Role of Representation Disentanglement: Investigating Compositional Generalization in CLIP Models Reza Abbasi, Mohammad Rohban, Mahdieh Soleymani Baghshah
PDF
DeCo: Decoupled Human-Centered Diffusion Video Editing with Motion Consistency Xiaojing Zhong, Xinyi Huang, Xiaofeng Yang, Guosheng Lin, Qingyao Wu
PDF
DECOLLAGE: 3D Detailization by Controllable, Localized, and Learned Geometry Enhancement Qimin Chen, Zhiqin Chen, Vladimir G. Kim, Noam Aigerman, Hao Zhang, Siddhartha Chaudhuri
PDF
Decomposed Vector-Quantized Variational Autoencoder for Human Grasp Generation Zhao Zhe, Mengshi Qi, Huadong Ma
PDF
Decomposition Betters Tracking Everything Everywhere Rui Li, Dong Liu
PDF
Decomposition of Neural Discrete Representations for Large-Scale 3D Mapping Minseong Park, Suhan Woo, Euntai Kim
PDF
Decoupling Common and Unique Representations for Multimodal Self-Supervised Learning Yi Wang, Conrad M Albrecht, Nassim Ait Ali Braham, Chenying Liu, Zhitong Xiong, Xiao Xiang Zhu
PDF
Deep Companion Learning: Enhancing Generalization Through Historical Consistency Ruizhao Zhu, Venkatesh Saligrama
PDF
Deep Cost Ray Fusion for Sparse Depth Video Completion Jungeon Kim, Soongjin Kim, Jaesik Park, Seungyong Lee
PDF
Deep Diffusion Image Prior for Efficient OOD Adaptation in 3D Inverse Problems Hyungjin Chung, Jong Chul Ye
PDF
Deep Feature Surgery: Towards Accurate and Efficient Multi-Exit Networks Cheng Gong, Yao Chen, Qiuyang Luo, Ye Lu, Tao Li, Yuzhi Zhang, Yufei Sun, Le Zhang
PDF
Deep Nets with Subsampling Layers Unwittingly Discard Useful Activations at Test-Time Chiao-An Yang, Ziwei Liu, Raymond Yeh
PDF
Deep Online Probability Aggregation Clustering Yuxuan Yan, Na Lu, Ruofan Yan
PDF
Deep Patch Visual SLAM Lahav Lipson, Zachary Teed, Jia Deng
PDF
Deep Polarization Cues for Single-Shot Shape and Subsurface Scattering Estimation Chenhao Li, Trung Thanh Ngo, Hajime Nagahara
PDF
Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models Xiaoshi Wu, Yiming Hao, Manyuan Zhang, Keqiang Sun, Zhaoyang Huang, Guanglu Song, Yu Liu, Hongsheng Li
PDF
Defect Spectrum: A Granular Look of Large-Scale Defect Datasets with Rich Semantics Shuai Yang, ZhiFei Chen, Pengguang Chen, Xi Fang, Yixun Liang, Shu Liu, Yingcong Chen
PDF
Delving Deep into Engagement Prediction of Short Videos Dasong Li, Wenjie Li, Baili Lu, Hongsheng Li, Sizhuo Ma, Gurunandan Krishnan, Jian Wang
PDF
Delving into Adversarial Robustness on Document Tampering Localization Huiru Shao, Zhuang Qian, Kaizhu Huang, Wei Wang, Xiaowei Huang, Qiufeng Wang
PDF
Denoising Vision Transformers Jiawei Yang, Katie Z Luo, Jiefeng Li, Congyue Deng, Leonidas Guibas, Dilip Krishnan, Kilian Weinberger, Yonglong Tian, Yue Wang
PDF
denoiSplit: A Method for Joint Microscopy Image Splitting and Unsupervised Denoising Ashesh Ashesh, Florian Jug
PDF
Dense Hand-Object(HO) GraspNet with Full Grasping Taxonomy and Dynamics Woojin Cho, Jihyun Lee, Minjae Yi, Minje Kim, Taeyun Woo, Donghwan Kim, Taewook Ha, Hyokeun Lee, Je-Hwan Ryu, Woontack Woo, Tae-Kyun Kim
PDF
Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding Ruihuang Li, Zhengqiang Zhang, Chenhang He, Zhiyuan Ma, Vishal Patel, Lei Zhang
PDF
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs DongHyun Kim, Byeongho Heo, Dongyoon Han
PDF
Dependency-Aware Differentiable Neural Architecture Search Buang Zhang, Xinle Wu, Hao Miao, Bin Yang, Chenjuan Guo
PDF
DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks Sarah Jabbour, Gregory Kondas, Ella Kazerooni, Michael Sjoding, David Fouhey, Jenna Wiens
PDF
Depicting Beyond Scores: Advancing Image Quality Assessment Through Multi-Modal Language Models Zhiyuan You, Zheyuan Li, Jinjin Gu, Zhenfei Yin, Tianfan Xue, Chao Dong
PDF
Depth on Demand: Streaming Dense Depth from a Low Frame Rate Active Sensor Andrea Conti, Matteo Poggi, Valerio Cambareri, Stefano Mattoccia
PDF
Depth-Aware Blind Image Decomposition for Real-World Adverse Weather Recovery Chao Wang, Zhedong Zheng, Ruijie Quan, Yi Yang
PDF
Depth-Guided NeRF Training via Earth Mover’s Distance Anita Rau, Josiah Aklilu, Floyd C Holsinger, Serena Yeung-Levy
PDF
DetailSemNet: Elevating Signature Verification Through Detail-Semantic Integration Meng-Cheng Shih, Tsai-Ling Huang, Yu-Heng Shih, Hong-Han Shuai, Hsuan-Tung Liu, Yi-Ren Yeh, Ching-Chun Huang
PDF
Detecting as Labeling: Rethinking LiDAR-Camera Fusion in 3D Object Detection Junjie Huang, Yun Ye, Zhujin Liang, Yi Shan, Dalong Du
PDF
DeTra: A Unified Model for Object Detection and Trajectory Forecasting Sergio Casas, Ben T Agro, Jiageng Mao, Thomas Gilles, Alexander Y Cui, Enxu Li, Raquel Urtasun
PDF
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM Yixuan Wu, Yizhou Wang, Shixiang Tang, Wenhao Wu, Tong He, Wanli Ouyang, Philip Torr, Jian Wu
PDF
DEVIAS: Learning Disentangled Video Representations of Action and Scene Kyungho Bae, Youngrae Kim, Geo Ahn, Jinwoo Choi
PDF
DG-PIC: Domain Generalized Point-in-Context Learning for Point Cloud Understanding Jincen Jiang, Qianyu Zhou, Yuhang Li, Xuequan Lu, Meili Wang, Lizhuang Ma, Jian Chang, Jian Jun Zhang
PDF
DGD: Dynamic 3D Gaussians Distillation Isaac Labe, Noam Issachar, Itai Lang, Sagie Benaim
PDF
DGE: Direct Gaussian 3D Editing by Consistent Multi-View Editing Minghao Chen, Iro Laina, Andrea Vedaldi
PDF
DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control Yuru Jia, Lukas Hoyer, Shengyu Huang, Tianfu Wang, Luc Van Gool, Konrad Schindler, Anton Obukhov
PDF
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification Wenhui Zhu, Xiwen Chen, Peijie Qiu, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang
PDF
DHR: Dual Features-Driven Hierarchical Rebalancing in Inter- and Intra-Class Regions for Weakly-Supervised Semantic Segmentation Sanghyun Jo, Fei Pan, In-Jae Yu, Kyungsu Kim
PDF
Diagnosing and Re-Learning for Balanced Multimodal Learning Yake Wei, Siwei Li, Ruoxuan Feng, Di Hu
PDF
DIAL: Dense Image-Text ALignment for Weakly Supervised Semantic Segmentation Soojin Jang, JungMin Yun, JuneHyoung Kwon, Eunju Lee, YoungBin Kim
PDF
Diff-Reg: Diffusion Model in Doubly Stochastic Matrix Space for Registration Problem Qianliang Wu, Haobo Jiang, Lei Luo, Jun Li, Yaqing Ding, Jin Xie, Jian Yang
PDF
Diff-Tracker: Text-to-Image Diffusion Models Are Unsupervised Trackers Zhengbo Zhang, Li Xu, Duo Peng, Hossein Rahmani, Jun Liu
PDF
Diff3DETR: Agent-Based Diffusion Model for Semi-Supervised 3D Object Detection Jiacheng Deng, Jiahao Lu, Tianzhu Zhang
PDF
DiffBIR: Toward Blind Image Restoration with Generative Diffusion Prior Xinqi Lin, Jingwen He, Ziyan Chen, Zhaoyang Lyu, Bo Dai, Fanghua Yu, Yu Qiao, Wanli Ouyang, Chao Dong
PDF
DiffCD: A Symmetric Differentiable Chamfer Distance for Neural Implicit Surface Fitting Linus Härenstam-Nielsen, Lu Sang, Abhishek Saroha, Nikita Araslanov, Daniel Cremers
PDF
DiffClass: Diffusion-Based Class Incremental Learning Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang
PDF
DIFFender: Diffusion-Based Adversarial Defense Against Patch Attacks Caixin Kang, Yinpeng Dong, Zhengyi Wang, Shouwei Ruan, Yubo Chen, Hang Su, Xingxing Wei
PDF
Differentiable Convex Polyhedra Optimization from Multi-View Images Daxuan Ren, Haiyi Mei, Hezi Shi, Jianmin Zheng, Jianfei Cai, Lei Yang
PDF
Differentiable Product Quantization for Memory Efficient Camera Relocalization Zakaria Laskar, Iaroslav Melekhov, Assia Benbihi, Shuzhe Wang, Juho Kannala
PDF
DiffFAS: Face Anti-Spoofing via Generative Diffusion Models Xinxu Ge, Xin Liu, Zitong Yu, Jingang Shi, Chun Qi, Jie Li, Heikki Kälviäinen
PDF
DiffiT: Diffusion Vision Transformers for Image Generation Ali Hatamizadeh, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat
PDF
DiffPMAE: Diffusion Masked Autoencoders for Point Cloud Reconstruction Yanlong Li, Chamara Madarasingha, Kanchana Thilakarathna
PDF
DiffSurf: A Transformer-Based Diffusion Model for Generating and Reconstructing 3D Surfaces in Pose Yusuke Yoshiyasu, Leyuan Sun
PDF
DiffuMatting: Synthesizing Arbitrary Objects with Matting-Level Annotation Xiaobin Hu, Xu Peng, Donghao Luo, Xiaozhong Ji, Jinlong Peng, ZhengKai Jiang, Jiangning Zhang, Taisong Jin, Chengjie Wang, Rongrong Ji
PDF
Diffusion Bridges for 3D Point Cloud Denoising Mathias Vogel Hüni, Keisuke Tateno, Marc Pollefeys, Federico Tombari, Marie-Julie Rakotosaona, Francis Engelmann
PDF
Diffusion for Natural Image Matting Yihan Hu, Yiheng Lin, Wei Wang, Yao Zhao, Yunchao Wei, Humphrey Shi
PDF
Diffusion for Out-of-Distribution Detection on Road Scenes and Beyond Silvio Galesso, Philipp Schröppel, Hssan Driss, Thomas Brox
PDF
Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation Duy Tho Le, Hengcan Shi, Jianfei Cai, Hamid Rezatofighi
PDF
Diffusion Model Is a Good Pose Estimator from 3D RF-Vision Junqiao Fan, Jianfei Yang, Yuecong Xu, Lihua Xie
PDF
Diffusion Models Are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors Ruicheng Wang, Jianfeng Xiang, Jiaolong Yang, Xin Tong
PDF
Diffusion Models as Data Mining Tools Ioannis Siglidis, Aleksander Holynski, Alexei A. Efros, Mathieu Aubry, Shiry Ginosar
PDF
Diffusion Models as Optimizers for Efficient Planning in Offline RL Renming Huang, Yunqiang Pei, Guoqing Wang, Yangming Zhang, Yang Yang, Peng Wang, Heng Tao Shen
PDF
Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi
PDF
Diffusion Models for Open-Vocabulary Segmentation Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht
PDF
Diffusion Prior-Based Amortized Variational Inference for Noisy Inverse Problems Sojin Lee, Dogyun Park, Inho Kong, Hyunwoo J. Kim
PDF
Diffusion Reward: Learning Rewards via Conditional Video Diffusion Tao Huang, Guangqi Jiang, Yanjie Ze, Huazhe Xu
PDF
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models Benjamin J Biggs, Arjun Seshadri, Yang Zou, Achin Jain, Aditya Golatkar, Yusheng Xie, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto
PDF
Diffusion-Based Image-to-Image Translation by Noise Correction via Prompt Interpolation Junsung Lee, Minsoo Kang, Bohyung Han
PDF
Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning Jinglin Liang, Jin Zhong, Hanlin Gu, Zhongqi Lu, Xingxing Tang, Gang Dai, Shuangping Huang, Lixin Fan, Qiang Yang
PDF
Diffusion-Generated Pseudo-Observations for High-Quality Sparse-View Reconstruction Xinhang Liu, Jiaben Chen, Shiu-Hong Kao, Yu-Wing Tai, Chi-Keung Tang
PDF
Diffusion-Guided Weakly Supervised Semantic Segmentation Sung-Hoon Yoon, Hoyong Kwon, Jaeseok Jeong, Daehee Park, Kuk-Jin Yoon
PDF
Diffusion-Refined VQA Annotations for Semi-Supervised Gaze Following Qiaomu Miao, Alexandros Graikos, Jingwei Zhang, Sounak Mondal, Minh Hoai, Dimitris Samaras
PDF
DiffusionDepth: Diffusion Denoising Approach for Monocular Depth Estimation Yiqun Duan, Xianda Guo, Zheng Zhu
PDF
DiffusionPen: Towards Controlling the Style of Handwritten Text Generation Konstantina Nikolaidou, George Retsinas, Giorgos Sfikas, Marcus Liwicki
PDF
DiffuX2CT: Diffusion Learning to Reconstruct CT Images from Biplanar X-Rays Xuhui Liu, Zhi Qiao, Runkun Liu, Hong Li, Xiantong Zhen, Zhen Qian, Juan Zhang, Baochang Zhang
PDF
DIM: Dyadic Interaction Modeling for Social Behavior Generation Minh Tran, Di Chang, Maksim Siniukov, Mohammad Soleymani
PDF
DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video Narek Tumanyan, Assaf Singer, Shai Bagon, Tali Dekel
PDF
Direct Distillation Between Different Domains Jialiang Tang, Shuo Chen, Gang Niu, Hongyuan Zhu, Joey Tianyi Zhou, Chen Gong, Masashi Sugiyama
PDF
DISCO: Embodied Navigation and Interaction via Differentiable Scene Semantics and Dual-Level Control Xinyu Xu, Shengcheng Luo, Yanchao Yang, Yong-Lu Li, Cewu Lu
PDF
DiscoMatch: Fast Discrete Optimisation for Geometrically Consistent 3D Shape Matching Paul Roetzer, Ahmed Abbas, Dongliang Cao, Florian Bernard, Paul Swoboda
PDF
Discover-Then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery Sukrut Rao, Sweta Mahajan, Moritz Böhle, Bernt Schiele
PDF
Discovering Novel Actions from Open World Egocentric Videos with Object-Grounded Visual Commonsense Reasoning Sanjoy Kundu, Shubham Trehan, Sathyanarayanan N Aakur
PDF
Discovering Unwritten Visual Classifiers with Large Language Models Mia Chiquier, Utkarsh Mall, Carl Vondrick
PDF
Disentangled Clothed Avatar Generation from Text Descriptions Jionghao Wang, Yuan Liu, Zhiyang Dou, Zhengming Yu, Yongqing Liang, Cheng Lin, Rong Xie, Li Song, Xin Li, Wenping Wang
PDF
Disentangled Generation and Aggregation for Robust Radiance Fields Shihe Shen, Huachen Gao, Wangze Xu, Rui Peng, Luyang Tang, Kaiqiang Xiong, Jianbo Jiao, Ronggang Wang
PDF
Disentangling Masked Autoencoders for Unsupervised Domain Generalization An Zhang, Han Wang, Xiang Wang, Tat-Seng Chua
PDF
Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions Jin Gao, Lei Gan, Yuankai Li, Yixin Ye, Dequan Wang
PDF
Dissolving Is Amplifying: Towards Fine-Grained Anomaly Detection Jian Shi, Pengyi Zhang, Ni Zhang, Hakim Ghazzai, Peter Wonka
PDF
Distill Gold from Massive Ores: Bi-Level Data Pruning Towards Efficient Dataset Distillation Yue Xu, Yong-Lu Li, Kaitong Cui, Ziyu Wang, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang
PDF
Distilling Diffusion Models into Conditional GANs MinGuk Kang, Richard Zhang, Connelly Barnes, Sylvain Paris, Suha Kwak, Jaesik Park, Eli Shechtman, Jun-Yan Zhu, Taesung Park
PDF
Distilling Knowledge from Large-Scale Image Models for Object Detection Gang Li, Wenhai Wang, Xiang Li, Ziheng Li, Jian Yang, Jifeng Dai, Yu Qiao, Shanshan Zhang
PDF
Distractor-Free Novel View Synthesis via Exploiting Memorization Effect in Optimization Yukun Wang, Kunhong Li, Minglin Chen, Longguang Wang, Shunbo Zhou, Kaiwen Xue, Yulan Guo
PDF
Distractors-Immune Representation Learning with Cross-Modal Contrastive Regularization for Change Captioning Yunbin Tu, Liang Li, Li Su, Chenggang Yan, Qingming Huang
PDF
Distributed Active Client Selection with Noisy Clients Using Model Association Scores Kwang In Kim
PDF
Distributed Semantic Segmentation with Efficient Joint Source and Task Decoding Danish Nazir, Timo Bartels, Jan Piewek, Thorsten Bagdonat, Tim Fingscheidt
PDF
Distribution Alignment for Fully Test-Time Adaptation with Dynamic Online Data Streams Ziqiang Wang, Zhixiang Chi, Yanan Wu, Li Gu, Zhi Liu, Konstantinos N Plataniotis, Yang Wang
PDF
Distribution-Aware Robust Learning from Long-Tailed Data with Noisy Labels Jae Soon Baik, In Young Yoon, Kun Hoon Kim, Jun Won Choi
PDF
Distributionally Robust Loss for Long-Tailed Multi-Label Image Classification Dekun Lin, Zhe Cui, Rui Chen, Tailai Peng, Xinran Xie, Xiaolin Qin
PDF
Diverse Text-to-3D Synthesis with Augmented Text Embedding Uy Dieu Tran, Minh N. Hoang Luu, Phong Ha Nguyen, Khoi Nguyen, Binh-Son Hua
PDF
Divide and Fuse: Body Part Mesh Recovery from Partially Visible Human Images Tianyu Luan, Zhongpai Gao, Luyuan Xie, Abhishek Sharma, Hao Ding, Benjamin Planche, Meng Zheng, Ange Lou, Terrence Chen, Junsong Yuan, Ziyan Wu
PDF
DMiT: Deformable Mipmapped Tri-Plane Representation for Dynamic Scenes Jing-Wen Yang, Jia-Mu Sun, Yong-Liang Yang, Jie Yang, Ying Shan, Yan-Pei Cao, Lin Gao
PDF
DNI: Dilutional Noise Initialization for Diffusion Video Editing Sunjae Yoon, Gwanhyeong Koo, Ji Woo Hong, Chang D. Yoo
PDF
Do Generalised Classifiers Really Work on Human Drawn Sketches? Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Ayan Kumar Bhunia, Yi-Zhe Song
PDF
Do Text-Free Diffusion Models Learn Discriminative Visual Representations? Soumik Mukhopadhyay, Matthew A Gwilliam, Yosuke Yamaguchi, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Tianyi Zhou, Jun Ohya, Abhinav Shrivastava
PDF
DOCCI: Descriptions of Connected and Contrasting Images Yasumasa Onoe, Sunayana Rane, Zachary E Berger, Yonatan Bitton, Jaemin Cho, Roopal Garg, Alexander Ku, Zarana Parekh, Jordi Pont-Tuset, Garrett Tanzer, Su Wang, Jason M Baldridge
PDF
Dolfin: Diffusion Layout Transformers Without Autoencoder Yilin Wang, Zeyuan Chen, Liangjun Zhong, Zheng Ding, Zhuowen Tu
PDF
Dolphins: Multimodal Language Model for Driving Yingzi Ma, Yulong Cao, Jiachen Sun, Marco Pavone, Chaowei Xiao
PDF
Domain Generalization of 3D Object Detection by Density-Resampling Shuangzhi Li, Lei Ma, Xingyu Li
PDF
Domain Reduction Strategy for Non-Line-of-Sight Imaging Hyunbo Shim, In Cho, Daekyu Kwon, Seon Joo Kim
PDF
Domain Shifting: A Generalized Solution for Heterogeneous Cross-Modality Person Re-Identification Yan Jiang, Xu Cheng, Hao Yu, Xingyu Liu, Haoyu Chen, Guoying Zhao
PDF
Domain-Adaptive 2D Human Pose Estimation via Dual Teachers in Extremely Low-Light Conditions Yihao Ai, Yifei Qi, Bo Wang, Yu Cheng, Xinchao Wang, Robby T. Tan
PDF
Domain-Adaptive Video Deblurring via Test-Time Blurring Jin-Ting He, Fu-Jen Tsai, Jia-Hao Wu, Yan-Tsung Peng, Chung-Chi Tsai, Chia-Wen Lin, Yen-Yu Lin
PDF
DomainFusion: Generalizing to Unseen Domains with Latent Diffusion Models Yuyang Huang, Yabo Chen, Yuchen Liu, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong, Qi Tian
PDF
Domesticating SAM for Breast Ultrasound Image Segmentation via Spatial-Frequency Fusion and Uncertainty Correction Wanting Zhang, Huisi Wu, Jing Qin
PDF
DoubleTake: Geometry Guided Depth Estimation Mohamed Sayed, Filippo Aleotti, Jamie Watson, Zawar Qureshi, Guillermo Garcia-Hernando, Gabriel Brostow, Sara Vicente, Michael Firman
PDF
DoughNet: A Visual Predictive Model for Topological Manipulation of Deformable Objects Dominik Bauer, Zhenjia Xu, Shuran Song
PDF
DPA-Net: Structured 3D Abstraction from Sparse Views via Differentiable Primitive Assembly Fenggen Yu, Yiming Qian, Xu Zhang, Francisca Gil-Ureta, Brian Jackson, Eric Bennett, Hao Zhang
PDF
DQ-DETR: DETR with Dynamic Query for Tiny Object Detection Yi-Xin Huang, Hou-I Liu, Hong-Han Shuai, Wen-Huang Cheng
PDF
Drag Anything: Motion Control for Anything Using Entity Representation Weijia Wu, Zhuang Li, Yuchao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Zhang Di
PDF
DragAPart: Learning a Part-Level Motion Prior for Articulated Objects Ruining Li, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi
PDF
DragVideo: Interactive Drag-Style Video Editing Yufan Deng, Ruida Wang, Yuhao Zhang, Yu-Wing Tai, Chi-Keung Tang
PDF
DreamDiffusion: High-Quality EEG-to-Image Generation with Temporal Masked Signal Modeling and CLIP Alignment Yunpeng Bai, Xintao Wang, Yan-Pei Cao, Yixiao Ge, Chun Yuan, Ying Shan
PDF
DreamDissector: Learning Disentangled Text-to-3D Generation from 2D Diffusion Priors Zizheng Yan, Jiapeng Zhou, Fanpeng Meng, Yushuang Wu, Lingteng Qiu, Zisheng Ye, Shuguang Cui, Guanying Chen, Xiaoguang Han
PDF
DreamDrone: Text-to-Image Diffusion Models Are Zero-Shot Perpetual View Generators Hanyang Kong, Dongze Lian, Michael Bi Mi, Xinchao Wang
PDF
DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation Haibo Yang, Yang Chen, Yingwei Pan, Ting Yao, Zhineng Chen, Zuxuan Wu, Yu-Gang Jiang, Tao Mei
PDF
DreamMotion: Space-Time Self-Similar Score Distillation for Zero-Shot Video Editing Hyeonho Jeong, Jinho Chang, Geon Yeong Park, Jong Chul Ye
PDF
DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion Liao Shen, Tianqi Liu, Huiqiang Sun, Xinyi Ye, Baopu Li, Jianming Zhang, Zhiguo Cao
PDF
DreamReward: Aligning Human Preference in Text-to-3D Generation Junliang Ye, Fangfu Liu, Qixiu Li, Zhengyi Wang, Yikai Wang, Xinzhou Wang, Yueqi Duan, Jun Zhu
PDF
DreamSampler: Unifying Diffusion Sampling and Score Distillation for Image Manipulation Jeongsol Kim, Geon Yeong Park, Jong Chul Ye
PDF
DreamScene: 3D Gaussian-Based Text-to-3D Scene Generation via Formation Pattern Sampling Haoran Li, Haolin Shi, Wenli Zhang, Wenjun Wu, Yong Liao, Lin Wang, Lik-Hang Lee, Peng Yuan Zhou
PDF
DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting Shijie Zhou, Zhiwen Fan, Dejia Xu, Haoran Chang, Pradyumna Chari, Tejas K Bharadwaj, Suya You, Zhangyang Wang, Achuta Kadambi
PDF
DreamStruct: Understanding Slides and User Interfaces via Synthetic Data Generation Yi-Hao Peng, Faria Huq, Yue Jiang, Jason Wu, Xin Yue Li, Jeffrey Bigham, Amy Pavel
PDF
DreamView: Injecting View-Specific Text Guidance into Text-to-3D Generation Junkai Yan, Yipeng Gao, Qize Yang, Xihan Wei, Xuansong Xie, Ancong Wu, Wei-Shi Zheng
PDF
DriveDreamer: Towards Real-World-Driven World Models for Autonomous Driving Xiaofeng Wang, Zheng Zhu, Guan Huang, Chen Xinze, Jiagang Zhu, Jiwen Lu
PDF
DriveLM: Driving with Graph Visual Question Answering Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Zhang Hanxue, Chengen Xie, Jens Beißwenger, Ping Luo, Andreas Geiger, Hongyang Li
PDF
DrivingDiffusion: Layout-Guided Multi-View Driving Scenarios Video Generation with Latent Diffusion Model Li Xiaofan, Zhang Yifu, Ye Xiaoqing
PDF
Dropout Mixture Low-Rank Adaptation for Visual Parameters-Efficient Fine-Tuning Zhengyi Fang, Yue Wang, Ran Yi, Lizhuang Ma
PDF
DSA: Discriminative Scatter Analysis for Early Smoke Segmentation Lujian Yao, Haitao Zhao, Jingchao Peng, Zhongze Wang, Kaijie Zhao
PDF
DSMix: Distortion-Induced Saliency Map Based Pre-Training for No-Reference Image Quality Assessment Jinsong Shi, Pan Gao, Xiaojiang Peng, Jie Qin
PDF
Dual-Camera Smooth Zoom on Mobile Phones Renlong Wu, Zhilu Zhang, Yu Yang, Wangmeng Zuo
PDF
Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning Jia-Hao Xiao, Ming-Kun Xie, Heng-Bo Fan, Gang Niu, Masashi Sugiyama, Sheng-Jun Huang
PDF
Dual-Level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation Ruijie Xu, Chuyu Zhang, Hui Ren, Xuming He
PDF
Dual-Path Adversarial Lifting for Domain Shift Correction in Online Test-Time Adaptation Yushun Tang, Shuoshuo Chen, Zhihe Lu, Xinchao Wang, Zhihai He
PDF
Dual-Rain: Video Rain Removal Using Assertive and Gentle Teachers Tingting Chen, Beibei Lin, Yeying Jin, Wending Yan, Wei Ye, Yuan Yuan, Robby T. Tan
PDF
Dual-Stage Hyperspectral Image Classification Model with Spectral Supertoken Peifu Liu, Tingfa Xu, Jie Wang, Huan Chen, Huiyan Bai, Jianan Li
PDF
DualBEV: Unifying Dual View Transformation with Probabilistic Correspondences Peidong Li, Wancheng Shen, Qihao Huang, Dixiao Cui
PDF
DualDn: Dual-Domain Denoising via Differentiable ISP Ruikang Li, Yujin Wang, Shiqi Chen, Fan Zhang, Jinwei Gu, Tianfan Xue
PDF
DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment Jiuming Liu, Dong Zhuo, Zhiheng Feng, Siting Zhu, Chensheng Peng, Zhe Liu, Hesheng Wang
PDF
DyFADet: Dynamic Feature Aggregation for Temporal Action Detection Le Yang, Ziwei Zheng, Yizeng Han, Hao Cheng, Shiji Song, Gao Huang, Fan Li
PDF
Dyn-Adapter: Towards Disentangled Representation for Efficient Visual Recognition Yurong Zhang, Honghao Chen, Zhang Xinyu, Xiangxiang Chu, Li Song
PDF
Dynamic Data Selection for Efficient SSL via Coarse-to-Fine Refinement Aditay Tripathi, Pradeep Shenoy, Anirban Chakraborty
PDF
Dynamic Guidance Adversarial Distillation with Enhanced Teacher Knowledge Hyejin Park, Dongbo Min
PDF
Dynamic Neural Radiance Field from Defocused Monocular Video Xianrui Luo, Huiqiang Sun, Juewen Peng, Zhiguo Cao
PDF
Dynamic Retraining-Updating Mean Teacher for Source-Free Object Detection Trinh Le Ba Khanh, Huy-Hung Nguyen, Long Hoang Pham, Duong Nguyen-Ngoc Tran, Jae Wook Jeon
PDF
DynamiCrafter: Animating Open-Domain Images with Video Diffusion Priors Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, Hanyuan Liu, Gongye Liu, Xintao Wang, Ying Shan, Tien-Tsin Wong
PDF
DynMF: Neural Motion Factorization for Real-Time Dynamic View Synthesis with 3D Gaussian Splatting Angelos Kratimenos, Jiahui Lei, Kostas Daniilidis
PDF
DynoSurf: Neural Deformation-Based Temporally Consistent Dynamic Surface Reconstruction Yuxin Yao, Siyu Ren, Junhui Hou, Zhi Deng, Juyong Zhang, Wenping Wang
PDF
DySeT: A Dynamic Masked Self-Distillation Approach for Robust Trajectory Prediction Mozghan Pourkeshavarz, Arielle Zhang, Amir Rasouli
PDF
DεpS: Delayed ε-Shrinking for Faster Once-for-All Training Aditya Annavajjala, Alind Khare, Animesh Agrawal, Igor Fedorov, Hugo M Latapie, Myungjin Lee, Alexey Tumanov
PDF
E.T. the Exceptional Trajectory: Text-to-Camera-Trajectory Generation with Character Awareness Robin Courant, Nicolas Dufour, Xi Wang, Marc Christie, Vicky Kalogeiton
PDF
E3M: Zero-Shot Spatio-Temporal Video Grounding with Expectation-Maximization Multimodal Modulation Peijun Bao, Zihao Shao, Wenhan Yang, Boon Poh Ng, Alex Kot
PDF
E3V-K5: An Authentic Benchmark for Redefining Video-Based Energy Expenditure Estimation Shengxuming Zhang, Lei Jin, Yifan Wang, Xinyu Wang, Xu Wen, Zunlei Feng, Mingli Song
PDF
EA-VTR: Event-Aware Video-Text Retrieval Zongyang Ma, Ziqi Zhang, Yuxin Chen, Zhongang Qi, Chunfeng Yuan, Bing Li, Yingmin Luo, Xu Li, Xiaojuan Qi, Ying Shan, Weiming Hu
PDF
EAFormer: Scene Text Segmentation with Edge-Aware Transformers Haiyang Yu, Teng Fu, Bin Li, Xiangyang Xue
PDF
EAGLES: Efficient Accelerated 3D Gaussians with Lightweight EncodingS Sharath Girish, Kamal Gupta, Abhinav Shrivastava
PDF
Early Anticipation of Driving Maneuvers Abdul Wasi Lone, Shankar Gangisetty, Shyam Nandan Rai, C. V. Jawahar
PDF
Early Preparation Pays Off: New Classifier Pre-Tuning for Class Incremental Semantic Segmentation Zhengyuan Xie, Haiquan Lu, Jia-wen Xiao, Enguang Wang, Le Zhang, Xialei Liu
PDF
EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-Based Detection with Recurrent Spiking Neural Networks Ziming Wang, Ziling Wang, Huaning Li, Lang Qin, Runhao Jiang, De Ma, Huajin Tang
PDF
Easing 3D Pattern Reasoning with Side-View Features for Semantic Scene Completion Linxi Huan, Mingyue Dong, Linwei Yue, Shuhan Shen, Xianwei Zheng
PDF
EBDM: Exemplar-Guided Image Translation with Brownian-Bridge Diffusion Models Eungbean Lee, Somi Jeong, Kwanghoon Sohn
PDF
Echoes of the Past: Boosting Long-Tail Recognition via Reflective Learning Qihao Zhao, Yalun Dai, Shen Lin, Wei Hu, Fan Zhang, Jun Liu
PDF
EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion Guangyao Zhai, Evin Pınar Örnek, Dave Zhenyu Chen, Ruotong Liao, Yan Di, Nassir Navab, Federico Tombari, Benjamin Busam
PDF
EcoMatcher: Efficient Clustering Oriented Matcher for Detector-Free Image Matching Peiqi Chen, Lei Yu, Yi Wan, Yongjun Zhang, Jian Wang, Liheng Zhong, Jingdong Chen, Ming Yang
PDF
EDformer: Transformer-Based Event Denoising Across Varied Noise Levels Bin Jiang, Bo Xiong, Bohan Qu, M. Salman Asif, You Zhou, Zhan Ma
PDF
Edge-Guided Fusion and Motion Augmentation for Event-Image Stereo Fengan Zhao, Qianang Zhou, Junlin Xiong
PDF
Editable Image Elements for Controllable Synthesis Jiteng Mu, Michaël Gharbi, Richard Zhang, Eli Shechtman, Nuno Vasconcelos, Xiaolong Wang, Taesung Park
PDF
EditShield: Protecting Unauthorized Image Editing by Instruction-Guided Diffusion Models Ruoxi Chen, Haibo Jin, Yixin Liu, Jinyin Chen, Haohan Wang, Lichao Sun
PDF
EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis Shuai Tan, Bin Ji, Mengxiao Bi, Ye Pan
PDF
Effective Lymph Nodes Detection in CT Scans Using Location Debiased Query Selection and Contrastive Query Representation in Transformer Qinji Yu, Yirui Wang, Ke Yan, Haoshen Li, Dazhou Guo, Li Zhang, Na Shen, Qifeng Wang, Xiaowei Ding, Le Lu, Xianghua Ye, Dakai Jin
PDF
Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning Amandeep Kumar, Muhammad Awais, Sanath Narayan, Hisham Cholakkal, Salman Khan, Rao Muhammad Anwer
PDF
Efficient Active Domain Adaptation for Semantic Segmentation by Selecting Information-Rich Superpixels Yuan Gao, Zilei Wang, Yixin Zhang, Bohai Tu
PDF
Efficient and Versatile Robust Fine-Tuning of Zero-Shot Models Sungyeon Kim, Boseung Jeong, Donghyun Kim, Suha Kwak
PDF
Efficient Bias Mitigation Without Privileged Information Mateo Espinosa Zarlenga, Swami Sankaranarayanan, Jerone T. A. Andrews, Zohreh Shams, Mateja Jamnik, Alice Xiang
PDF
Efficient Cascaded Multiscale Adaptive Network for Image Restoration Yichen Zhou, Pan Zhou, Teck Khim Ng
PDF
Efficient Depth-Guided Urban View Synthesis Sheng Miao, Jiaxin Huang, Dongfeng Bai, Weichao Qiu, Liu Bingbing, Andreas Geiger, Yiyi Liao
PDF
Efficient Diffusion Transformer with Step-Wise Dynamic Attention Mediators Yifan Pu, Zhuofan Xia, Jiayi Guo, Dongchen Han, Qixiu Li, Duo Li, Yuhui Yuan, Ji Li, Yizeng Han, Shiji Song, Gao Huang, Xiu Li
PDF
Efficient Diffusion-Driven Corruption Editor for Test-Time Adaptation Yeongtak Oh, Jonghyun Lee, Jooyoung Choi, Dahuin Jung, Uiwon Hwang, Sungroh Yoon
PDF
Efficient Few-Shot Action Recognition via Multi-Level Post-Reasoning Cong Wu, Xiao-Jun Wu, Linze Li, Tianyang Xu, Zhenhua Feng, Josef Kittler
PDF
Efficient Frequency-Domain Image Deraining with Contrastive Regularization Ning Gao, Xingyu Jiang, Xiuhui Zhang, Yue Deng
PDF
Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders Alexandre Eymaël, Renaud Vandeghen, Anthony Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck
PDF
Efficient Inference of Vision Instruction-Following Models with Elastic Cache Zuyan Liu, Benlin Liu, Jiahui Wang, Yuhao Dong, Guangyi Chen, Yongming Rao, Ranjay Krishna, Jiwen Lu
PDF
Efficient Learning of Event-Based Dense Representation Using Hierarchical Memories with Adaptive Update Uday Kamal, Saibal Mukhopadhyay
PDF
Efficient NeRF Optimization - Not All Samples Remain Equally Hard Juuso Korhonen, Goutham Rangu, Hamed Rezazadegan Tavakoli, Juho Kannala
PDF
Efficient Neural Video Representation with Temporally Coherent Modulation Seungjun Shin, Suji Kim, Dokwan Oh
PDF
Efficient Pre-Training for Localized Instruction Generation of Procedural Videos Anil Batra, Davide Moltisanti, Laura Sevilla-Lara, Marcus Rohrbach, Frank Keller
PDF
Efficient Snapshot Spectral Imaging: Calibration-Free Parallel Structure with Aperture Diffraction Fusion Tao Lv, Lihao Hu, Shiqiao Li, Chenglong Huang, Xun Cao
PDF
Efficient Training of Spiking Neural Networks with Multi-Parallel Implicit Stream Architecture Zhigao Cao, Meng Li, Xiashuang Wang, Haoyu Wang, Fan Wang, Youjun Li, Zigang Huang
PDF
Efficient Training with Denoised Neural Weights Yifan Gong, Zheng Zhan, Yanyu Li, Yerlan Idelbayev, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, Jian Ren
PDF
Efficient Unsupervised Visual Representation Learning with Explicit Cluster Balancing Ioannis Maniadis Metaxas, Georgios Tzimiropoulos, Ioannis Patras
PDF
Efficient Vision Transformers with Partial Attention Xuan-Thuy Vo, Duy-Linh Nguyen, Adri Priadana, Kang-Hyun Jo
PDF
EGIC: Enhanced Low-Bit-Rate Generative Image Compression Guided by Semantic Segmentation Nikolai Körber, Eduard Kromer, Andreas Siebert, Sascha Hauke, Daniel Mueller-Gritschneder, Björn Schuller
PDF
EgoBody3M: Egocentric Body Tracking on a VR Headset Using a Diverse Dataset Amy Zhao, Chengcheng Tang, Lezi Wang, Yijing Li, Mihika Dave, Lingling Tao, Christopher D. Twigg, Robert Y. Wang
PDF
EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval Thomas Hummel, Shyamgopal Karthik, Mariana-Iuliana Georgescu, Zeynep Akata
PDF
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding Yuan-Ming Li, Wei-Jin Huang, An-Lan Wang, Ling-An Zeng, Jing-Ke Meng, Wei-Shi Zheng
PDF
EgoLifter: Open-World 3D Segmentation for Egocentric Perception Qiao Gu, Zhaoyang Lv, Duncan Frost, Simon Green, Julian Straub, Chris Sweeney
PDF
EgoPet: Egomotion and Interaction Data from an Animal's Perspective Amir Bar, Arya Bakhtiar, Danny L Tran, Antonio Loquercio, Jathushan Rajasegaran, Yann LeCun, Amir Globerson, Trevor Darrell
PDF
EgoPoseFormer: A Simple Baseline for Stereo Egocentric 3D Human Pose Estimation Chenhongyi Yang, Anastasia Tkach, Shreyas Hampali, Linguang Zhang, Elliot J Crowley, Cem Keskin
PDF
EgoPoser: Robust Real-Time Egocentric Pose Estimation from Sparse and Intermittent Observations Everywhere Jiaxi Jiang, Paul Streli, Manuel Meier, Christian Holz
PDF
EINet: Point Cloud Completion via Extrapolation and Interpolation Pingping Cai, Canyu Zhang, Lingjia Shi, Lili Wang, Nasrin Imanpour, Song Wang
PDF
Elegantly Written: Disentangling Writer and Character Styles for Enhancing Online Chinese Handwriting Yu Liu, Fatimah binti Khalid, Lei Wang, Youxi Zhang, Cunrui Wang
PDF
Elevating All Zero-Shot Sketch-Based Image Retrieval Through Multimodal Prompt Learning Mainak Singha, Ankit Jha, Divyam Gupta, Pranav Singla, Biplab Banerjee
PDF
Eliminating Feature Ambiguity for Few-Shot Segmentation Qianxiong Xu, Guosheng Lin, Chen Change Loy, Cheng Long, Ziyue Li, Rui Zhao
PDF
Eliminating Warping Shakes for Unsupervised Online Video Stitching Lang Nie, Chunyu Lin, Kang Liao, Yun Zhang, Shuaicheng Liu, Rui Ai, Yao Zhao
PDF
ELSE: Efficient Deep Neural Network Inference Through Line-Based Sparsity Exploration Zeqi Zhu, Alberto Garcia-Ortiz, Luc Waeijen, Egor Bondarev, Arash Pourtaherian, Orlando Moreira
PDF
Elucidating the Hierarchical Nature of Behavior with Masked Autoencoders Lucas Stoffl, Andy Bonnetto, Stéphane D'Ascoli, Alexander Mathis
PDF
Elysium: Exploring Object-Level Perception in Videos Through Semantic Integration Using MLLMs Han Wang, Yanjie Wang, Ye Yongjie, Yuxiang Nie, Can Huang
PDF
Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation Hyunwoo Yu, Yubin Cho, Beoungwoo Kang, Seunghun Moon, Kyeongbo Kong, Suk-Ju Kang
PDF
Embodied Understanding of Driving Scenarios Yunsong Zhou, Linyan Huang, Qingwen Bu, Jia Zeng, Tianyu Li, Hang Qiu, Hongzi Zhu, Minyi Guo, Yu Qiao, Hongyang Li
PDF
Embracing Events and Frames with Hierarchical Feature Refinement Network for Object Detection Hu Cao, Zehua Zhang, Yan Xia, Xinyi Li, Jiahao Xia, Guang Chen, Alois C. Knoll
PDF
EMDM: Efficient Motion Diffusion Model for Fast, High-Quality Human Motion Generation Wenyang Zhou, Zhiyang Dou, Zeyu Cao, Zhouyingcheng Liao, Jingbo Wang, Wenjia Wang, Yuan Liu, Taku Komura, Wenping Wang, Lingjie Liu
PDF
Emergent Visual-Semantic Hierarchies in Image-Text Representations Morris Alper, Hadar Averbuch-Elor
PDF
Emerging Property of Masked Token for Effective Pre-Training Hyesong Choi, Hunsang Lee, Seyoung Joung, Hyejin Park, Jiyeong Kim, Dongbo Min
PDF
EMIE-MAP: Large-Scale Road Surface Reconstruction Based on Explicit Mesh and Implicit Encoding Wenhua Wu, Qi Wang, Guangming Wang, Junping Wang, Tiankun Zhao, Yang Liu, Dongchao Gao, Zhe Liu, Hesheng Wang
PDF
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model Under Weak Conditions Linrui Tian, Qi Wang, Bang Zhang, Liefeng Bo
PDF
EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head Qianyun He, Xinya Ji, Yicheng Gong, Yuanxun Lu, Zhengyu Diao, Linjia Huang, Yao Yao, Siyu Zhu, Zhan Ma, Songcen Xu, Xiaofei Wu, Zixiao Zhang, Xun Cao, Hao Zhu
PDF
Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL Fangwei Zhong, Kui Wu, Hai Ci, Chu-ran Wang, Hao Chen
PDF
Encapsulating Knowledge in One Prompt Qi Li, Runpeng Yu, Xinchao Wang
PDF
End-to-End Rate-Distortion Optimized 3D Gaussian Representation Henan Wang, Hanxin Zhu, Tianyu He, Runsen Feng, Jiajun Deng, Jiang Bian, Zhibo Chen
PDF
Energy-Calibrated VAE with Test Time Free Lunch Yihong Luo, Siya Qiu, Xingjian Tao, Yujun Cai, Jing Tang
PDF
Energy-Induced Explicit Quantification for Multi-Modality MRI Fusion Xiaoming Qi, Yuan Zhang, Tong Wang, Guanyu Yang, Yueming Jin, Shuo Li
PDF
Enhanced Motion Forecasting with Visual Relation Reasoning Sungjune Kim, Hadam Baek, Seunggwan Lee, Hyung-gun Chi, Hyerin Lim, Jinkyu Kim, Sangpil Kim
PDF
Enhanced Sparsification via Stimulative Training Shengji Tang, Weihao Lin, Hancheng Ye, Peng Ye, Chong Yu, Baopu Li, Tao Chen
PDF
Enhancing Cross-Subject fMRI-to-Video Decoding with Global-Local Functional Alignment Chong Li, Xuelin Qian, Yun Wang, Jingyang Huo, Xiangyang Xue, Yanwei Fu, Jianfeng Feng
PDF
Enhancing Diffusion Models with Text-Encoder Reinforcement Learning Chaofeng Chen, Annan Wang, Haoning Wu, Liang Liao, Wenxiu Sun, Qiong Yan, Weisi Lin
PDF
Enhancing Optimization Robustness in 1-Bit Neural Networks Through Stochastic Sign Descent NianHui Guo, Hong Guo, Christoph Meinel, Haojin Yang
PDF
Enhancing Perceptual Quality in Video Super-Resolution Through Temporally-Consistent Detail Synthesis Using Diffusion Models Claudio Rota, Marco Buzzelli, Joost van de Weijer
PDF
Enhancing Plausibility Evaluation for Generated Designs with Denoising Autoencoder Jiajie Fan, Amal Trigui, Thomas Bäck, Hao Wang
PDF
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective Fangzhou Song, Bin Zhu, Yanbin Hao, Shuo Wang
PDF
Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models Yang Zhang, Tze Tzun Teoh, Wei Hern Lim, Kenji Kawaguchi
PDF
Enhancing Source-Free Domain Adaptive Object Detection with Low-Confidence Pseudo Label Distillation Ilhoon Yoon, Hyeongjun Kwon, Jin Kim, Junyoung Park, Hyunsung Jang, Kwanghoon Sohn
PDF
Enhancing Tampered Text Detection Through Frequency Feature Fusion and Decomposition Zhongxi Chen, Shen Chen, Taiping Yao, Ke Sun, Shouhong Ding, Xianming Lin, Liujuan Cao, Rongrong Ji
PDF
Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks Zhewei Wu, Ruilong Yu, Qihe Liu, Shuying Cheng, Shilin Qiu, Shijie Zhou
PDF
Enhancing Vectorized Map Perception with Historical Rasterized Maps Xiaoyu Zhang, Guangwei Liu, Zihao Liu, Ningyi Xu, Yunhui Liu, Ji Zhao
PDF
Enriching Information and Preserving Semantic Consistency in Expanding Curvilinear Object Segmentation Datasets Qin Lei, Jiang Zhong, Qizhu Dai
PDF
EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification Suorong Yang, Furao Shen, Jian Zhao
PDF
EpipolarGAN: Omnidirectional Image Synthesis with Explicit Camera Control Christopher May, Daniel Aliaga
PDF
Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration Xueyang Kang, Zhaoliang Luan, Kourosh Khoshelham, Bing Wang
PDF
Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection Deepti Hegde, Suhas Lohit, Kuan-Chuan Peng, Michael J. Jones, Vishal M. Patel
PDF
EraseDraw: Learning to Insert Objects by Erasing Them from Images Alper Canberk, Maksym Bondarenko, Ege Ozguroglu, Ruoshi Liu, Carl Vondrick
PDF
Eta Inversion: Designing an Optimal Eta Function for Diffusion-Based Real Image Editing Wonjun Kang, Kevin Galim, Hyung Il Koo
PDF
Evaluating Text-to-Visual Generation with Image-to-Text Generation Zhiqiu Lin, Deepak Pathak, Baiqi Li, Jiayao Li, Xide Xia, Graham Neubig, Pengchuan Zhang, Deva Ramanan
PDF
Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off Levente Halmosi, Bálint Mohos, Márk Jelasity
PDF
Event Camera Data Dense Pre-Training Yan Yang, Liyuan Pan, Liu Liu
PDF
Event Trojan: Asynchronous Event-Based Backdoor Attacks Ruofei Wang, Qing Guo, Haoliang Li, Renjie Wan
PDF
Event-Adapted Video Super-Resolution Zeyu Xiao, Dachun Kai, Yueyi Zhang, Zheng-Jun Zha, Xiaoyan Sun, Zhiwei Xiong
PDF
Event-Aided Time-to-Collision Estimation for Autonomous Driving Jinghang Li, Bangyan Liao, Xiuyuan Lu, Peidong Liu, Shaojie Shen, Yi Zhou
PDF
Event-Based Head Pose Estimation: Benchmark and Method Jiahui Yuan, Hebei Li, Yansong Peng, Jin Wang, Yuheng Jiang, Yueyi Zhang, Xiaoyan Sun
PDF
Event-Based Mosaicing Bundle Adjustment Shuang Guo, Guillermo Gallego
PDF
Event-Based Motion Magnification Yutian Chen, Shi Guo, Yu Fangzheng, Feng Zhang, Jinwei Gu, Tianfan Xue
PDF
EventBind: Learning a Unified Representation to Bind Them All for Event-Based Open-World Understanding Jiazhou Zhou, Xu Zheng, Yuanhuiyi Lyu, Lin Wang
PDF
Every Pixel Has Its Moments: Ultra-High-Resolution Unpaired Image-to-Image Translation via Dense Normalization Ming-Yang Ho, Che-Ming Wu, Min-Sheng Wu, Yufeng Jane Tseng
PDF
EvSign: Sign Language Recognition and Translation with Streaming Events Pengyu Zhang, Hao Yin, Zeren Wang, Wenyue Chen, Sheng Ming Li, Dong Wang, Huchuan Lu, Xu Jia
PDF
Ex2Eg-MAE: A Framework for Adaptation of Exocentric Video Masked Autoencoders for Egocentric Social Role Understanding Minh Tran, Yelin Kim, Che-Chun Su, Min Sun, Cheng-Hao Kuo, Mohammad Soleymani
PDF
Exact Diffusion Inversion via Bidirectional Integration Approximation Guoqiang Zhang, J.P. Lewis, W. Bastiaan Kleijn
PDF
Exemplar-Free Continual Representation Learning via Learnable Drift Compensation Alex Gomez-Villa, Dipam Goswami, Kai Wang, Andy Bagdanov, Bartlomiej Twardowski, Joost van de Weijer
PDF
ExMatch: Self-Guided Exploitation for Semi-Supervised Learning with Scarce Labeled Samples Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee
PDF
Expanding Scene Graph Boundaries: Fully Open-Vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention Zuyao Chen, Jinlin Wu, Zhen Lei, Zhaoxiang Zhang, Chang Wen Chen
PDF
Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts Andong Tan, Fengtao Zhou, Hao Chen
PDF
Explicitly Guided Information Interaction Network for Cross-Modal Point Cloud Completion Xu Hang, Chen Long, Wenxiao Zhang, Yuan Liu, Zhen Cao, Zhen Dong, Bisheng Yang
PDF
Exploiting Dual-Correlation for Multi-Frame Time-of-Flight Denoising Guanting Dong, Yueyi Zhang, Xiaoyan Sun, Zhiwei Xiong
PDF
Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models Minchan Kim, Minyeong Kim, Junik Bae, Suhwan Choi, Sungkyung Kim, Buru Chang
PDF
Exploiting Supervised Poison Vulnerability to Strengthen Self-Supervised Defense Jeremy Styborski, Mingzhi Lyu, Yi Huang, Adams Kong
PDF
Explorative Inbetweening of Time and Space Haiwen Feng, Zheng Ding, Zhihao Xia, Simon Niklaus, Victoria Fernandez Abrevaya, Michael J. Black, Xuaner Zhang
PDF
Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation Tong Shao, Zhuotao Tian, Hang Zhao, Jingyong Su
PDF
Exploring Active Learning in Meta-Learning: Enhancing Context Set Labeling Wonho Bae, Jing Wang, Danica J. Sutherland
PDF
Exploring Conditional Multi-Modal Prompts for Zero-Shot HOI Detection Ting Lei, Shaofeng Yin, Yuxin Peng, Yang Liu
PDF
Exploring Guided Sampling of Conditional GANs Yifei Zhang, Mengfei Xia, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Kecheng Zheng, Lianghua Huang, Yu Liu, Fan Cheng
PDF
Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model Danni Yang, Ruohan Dong, Jiayi Ji, Yiwei Ma, Haowei Wang, Xiaoshuai Sun, Rongrong Ji
PDF
Exploring Pre-Trained Text-to-Video Diffusion Models for Referring Video Object Segmentation Xuelu Feng, Dongdong Chen, Junsong Yuan, Chunming Qiao, Gang Hua, Zixin Zhu
PDF
Exploring Reliable Matching with Phase Enhancement for Night-Time Semantic Segmentation Yuwen Pan, Rui Sun, Naisong Luo, Tianzhu Zhang, Yongdong Zhang
PDF
Exploring the Feature Extraction and Relation Modeling for Light-Weight Transformer Tracking Jikai Zheng, Mingjiang Liang, Shaoli Huang, Jifeng Ning
PDF
Exploring Vulnerabilities in Spiking Neural Networks: Direct Adversarial Attacks on Raw Event Data Yanmeng Yao, Xiaohan Zhao, Bin Gu
PDF
Expressive Whole-Body 3D Gaussian Avatar Gyeongsik Moon, Takaaki Shiratori, Shunsuke Saito
PDF
External Knowledge Enhanced 3D Scene Generation from Sketch Zijie Wu, Mingtao Feng, Yaonan Wang, He Xie, Weisheng Dong, Bo Miao, Ajmal Mian
PDF
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation Yunhao Gou, Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung, James Kwok, Yu Zhang
PDF
F-HOI: Toward Fine-Grained Semantic-Aligned 3D Human-Object Interactions Jie Yang, Xuesong Niu, Nan Jiang, Ruimao Zhang, Siyuan Huang
PDF
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Yue Han, Junwei Zhu, Keke He, Xu Chen, Yanhao Ge, Wei Li, Xiangtai Li, Jiangning Zhang, Chengjie Wang, Yong Liu
PDF
Face Reconstruction Transfer Attack as Out-of-Distribution Generalization Yoon Gyo Jung, Jaewoo Park, Xingbo Dong, Hojin Park, Andrew Beng Jin Teoh, Octavia Camps
PDF
Faceptor: A Generalist Model for Face Perception Lixiong Qin, Mei Wang, Xuannan Liu, Yuhang Zhang, Wei Deng, Xiaoshuai Song, Weiran Xu, Weihong Deng
PDF
Facial Affective Behavior Analysis with Instruction Tuning Yifan Li, Anh Dao, Wentao Bao, Zhen Tan, Tianlong Chen, Huan Liu, Yu Kong
PDF
Factorized Diffusion: Perceptual Illusions by Noise Decomposition Daniel Geng, Inbum Park, Andrew Owens
PDF
Factorizing Text-to-Video Generation by Explicit Image Conditioning Rohit Girdhar, Mannat Singh, Andrew Brown, Quentin Duval, Samaneh Azadi, Sai Saketh Rambhatla, Mian Akbar Shah, Xi Yin, Devi Parikh, Ishan Misra
PDF
FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation Jingyi Tang, Gu Wang, Zeyu Chen, Shengquan Li, Xiu Li, Xiangyang Ji
PDF
FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification Yu Tian, Congcong Wen, Min Shi, Muhammad Muneeb Afzal, Hao Huang, Muhammad Osama Khan, Yan Luo, Yi Fang, Mengyu Wang
PDF
Fairness-Aware Vision Transformer via Debiased Self-Attention Yao Qiang, Chengyin Li, Prashant Khanduri, Dongxiao Zhu
PDF
FairViT: Fair Vision Transformer via Adaptive Masking Bowei Tian, Ruijie Du, Yanning Shen
PDF
Fake It till You Make It: Curricular Dynamic Forgery Augmentations Towards General Deepfake Detection Yuzhen Lin, Wentang Song, Bin Li, Yuezun Li, Jiangqun Ni, Han Chen, Qiushi Li
PDF
FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance Jiedong Zhuang, Jiaqi Hu, Lianrui Mu, Rui Hu, Xiaoyu Liang, Jiangnan Ye, Haoji Hu
PDF
FAMOUS: High-Fidelity Monocular 3D Human Digitization Using View Synthesis Vishnu Mani Hema, Shubhra Aich, Christian Haene, Jean-Charles Bazin, Fernando de la Torre
PDF
FARSE-CNN: Fully Asynchronous, Recurrent and Sparse Event-Based CNN Riccardo Santambrogio, Marco Cannici, Matteo Matteucci
PDF
Fast Context-Based Low-Light Image Enhancement via Neural Implicit Representations Tomáš Chobola, Yu Liu, Hanyi Zhang, Julia A Schnabel, Tingying Peng
PDF
Fast Diffusion-Based Counterfactuals for Shortcut Removal and Generation Nina Weng, Paraskevas Pegios, Eike Petersen, Aasa Feragen, Siavash Arjomand Bigdeli
PDF
Fast Encoding and Decoding for Implicit Video Representation Hao Chen, Saining Xie, Ser-Nam Lim, Abhinav Shrivastava
PDF
Fast Point Cloud Geometry Compression with Context-Based Residual Coding and INR-Based Refinement Hao Xu, Xi Zhang, Xiaolin Wu
PDF
Fast Registration of Photorealistic Avatars for VR Facial Animation Chaitanya Patel, Shaojie Bai, Te-Li Wang, Jason Saragih, Shih-En Wei
PDF
Fast Sprite Decomposition from Animated Graphics Tomoyuki Suzuki, Kotaro Kikuchi, Kota Yamaguchi
PDF
Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation Shentong Mo, Enze Xie, Yue Wu, Junsong Chen, Matthias Niessner, Zhenguo Li
PDF
Fast View Synthesis of Casual Videos with Soup-of-Planes Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen, Simon Niklaus, Jianming Zhang, Jia-Bin Huang, Feng Liu
PDF
FastCAD: Real-Time CAD Retrieval and Alignment from Scans and Videos Florian Maximilian Langer, Jihong Ju, Georgi Dikov, Gerhard Reitmayr, Mohsen Ghafoorian
PDF
FastPCI: Motion-Structure Guided Fast Point Cloud Frame Interpolation Tianyu Zhang, Guocheng Qian, Jin Xie, Jian Yang
PDF
Feature Diversification and Adaptation for Federated Domain Generalization Seunghan Yang, Seokeon Choi, Hyunsin Park, Sungha Choi, Simyung Chang, Sungrack Yun
PDF
Federated Learning with Local Openset Noisy Labels Zonglin Di, Zhaowei Zhu, Xiaoxiao Li, Yang Liu
PDF
FedHARM: Harmonizing Model Architectural Diversity in Federated Learning Anestis Kastellos, Athanasios Psaltis, Charalampos Z Patrikakis, Petros Daras
PDF
FedHide: Federated Learning by Hiding in the Neighbors Hyunsin Park, Sungrack Yun
PDF
FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients Shangchao Su, Bin Li, Xiangyang Xue
PDF
FedTSA: A Cluster-Based Two-Stage Aggregation Method for Model-Heterogeneous Federated Learning Boyu Fan, Chenrui Wu, Xiang Su, Pan Hui
PDF
FedVAD: Enhancing Federated Video Anomaly Detection with GPT-Driven Semantic Distillation Fan Qi, Ruijie Pan, Huaiwen Zhang, Changsheng Xu
PDF
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Keen You, Haotian Zhang, Eldon Schoop, Floris Weers, Amanda Swearngin, Jeff Nichols, Yinfei Yang, Zhe Gan
PDF
Few-Shot Anomaly-Driven Generation for Anomaly Classification and Segmentation Guan Gui, Bin-Bin Gao, Jun Liu, Chengjie Wang, Yunsheng Wu
PDF
Few-Shot Class Incremental Learning with Attention-Aware Self-Adaptive Prompt Chenxi Liu, Zhenyi Wang, Tianyi Xiong, Ruibo Chen, Yihan Wu, Junfeng Guo, Heng Huang
PDF
Few-Shot Defect Image Generation Based on Consistency Modeling Qingfeng Shi, Jing Wei, Fei Shen, Zhengtao Zhang
PDF
Few-Shot Image Generation by Conditional Relaxing Diffusion Inversion Yu Cao, Shaogang Gong
PDF
Few-Shot NeRF by Adaptive Rendering Loss Regularization Qingshan Xu, Xuanyu Yi, Jianyao Xu, Wenbing Tao, Yew Soon Ong, Hanwang Zhang
PDF
Find N' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments Djamahl Etchegaray, Zi Helen Huang, Tatsuya Harada, Yadan Luo
PDF
Finding a Needle in a Haystack: A Black-Box Approach to Invisible Watermark Detection Minzhou Pan, Zhenting Wang, Xin Dong, Vikash Sehwag, Lingjuan Lyu, Xue Lin
PDF
Finding Meaning in Points: Weakly Supervised Semantic Segmentation for Event Cameras Hoonhee Cho, Sung-Hoon Yoon, Hyeokjun Kweon, Kuk-Jin Yoon
PDF
Finding NeMo: Negative-Mined Mosaic Augmentation for Referring Image Segmentation Seongsu Ha, Chaeyun Kim, Donghwa Kim, Junho Lee, Sangho Lee, Joonseok Lee
PDF
Finding Visual Task Vectors Alberto Hojel, Yutong Bai, Trevor Darrell, Amir Globerson, Amir Bar
PDF
Fine-Grained Dynamic Network for Generic Event Boundary Detection Ziwei Zheng, Lijun He, Le Yang, Fan Li
PDF
Fine-Grained Scene Graph Generation via Sample-Level Bias Prediction Yansheng Li, Tingzhu Wang, Kang Wu, Linlin Wang, Xin Guo, Wenbin Wang
PDF
FineMatch: Aspect-Based Fine-Grained Image and Text Mismatch Detection and Correction Hang Hua, Jing Shi, Kushal Kafle, Simon Jenni, Daoan Zhang, John Collomosse, Scott Cohen, Jiebo Luo
PDF
FinePseudo: Improving Pseudo-Labelling Through Temporal-Alignablity for Semi-Supervised Fine-Grained Action Recognition Ishan Rajendrakumar Dave, Mamshad Nayeem Rizve, Mubarak Shah
PDF
FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving Xingtai Gui, Tengteng Huang, Haonan Shao, Haotian Yao, Chi Zhang
PDF
Fisher Calibration for Backdoor-Robust Heterogeneous Federated Learning Wenke Huang, Mang Ye, Zekun Shi, Bo Du, Dacheng Tao
PDF
FisherRF: Active View Selection and Mapping with Radiance Fields Using Fisher Information Wen Jiang, Boshu Lei, Kostas Daniilidis
PDF
Flash Cache: Reducing Bias in Radiance Cache Based Inverse Rendering Benjamin Attal, Dor Verbin, Ben Mildenhall, Peter Hedman, Jonathan T Barron, Matthew O'Toole, Pratul Srinivasan
PDF
Flash-Splat: 3D Reflection Removal with Flash Cues and Gaussian Splats Mingyang Xie, Haoming Cai, Sachin Shah, Yiran Xu, Brandon Y. Feng, Jia-Bin Huang, Christopher A. Metzler
PDF
FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally Qiuhong Shen, Xingyi Yang, Xinchao Wang
PDF
FlashTex: Fast Relightable Mesh Texturing with LightControlNet Kangle Deng, Timothy Omernick, Alexander B Weiss, Deva Ramanan, Jun-Yan Zhu, Tinghui Zhou, Maneesh Agrawala
PDF
FLAT: Flux-Aware Imperceptible Adversarial Attacks on 3D Point Clouds Keke Tang, Lujie Huang, Weilong Peng, Daizong Liu, Xiaofei Wang, Yang Ma, Ligang Liu, Zhihong Tian
PDF
Flatness-Aware Sequential Learning Generates Resilient Backdoors Hoang Pham, The-Anh Ta, Anh T Tran, Khoa D Doan
PDF
FlexAttention for Efficient High-Resolution Vision-Language Models Junyan Li, Delin Chen, Tianle Cai, Peihao Chen, Yining Hong, Zhenfang Chen, Yikang Shen, Chuang Gan
PDF
Flexible Distribution Alignment: Towards Long-Tailed Semi-Supervised Learning with Proper Calibration Emanuel Sanchez Aimar, Nathaniel D Helgesen, Yonghao Xu, Marco Kuhlmann, Michael Felsberg
PDF
FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing Gwanhyeong Koo, Sunjae Yoon, Ji Woo Hong, Chang D. Yoo
PDF
Flow-Assisted Motion Learning Network for Weakly-Supervised Group Activity Recognition Muhammad Adi Nugroho, Sangmin Woo, Sumin Lee, Jinyoung Park, Yooseung Wang, Donguk Kim, Changick Kim
PDF
FlowCon: Out-of-Distribution Detection Using Flow-Based Contrastive Learning Saandeep Aathreya, Shaun Canavan
PDF
Flowed Time of Flight Radiance Fields Mikhail Okunev, Marc Mapeke, Benjamin Attal, Christian Richardt, Matthew O'Toole, James Tompkin
PDF
Flying with Photons: Rendering Novel Views of Propagating Light Anagh Malik, Noah Juravsky, Ryan Po, Gordon Wetzstein, Kiriakos N. Kutulakos, David B. Lindell
PDF
FMBoost: Boosting Latent Diffusion with Flow Matching Johannes S Fischer, Ming Gui, Pingchuan Ma, Nick Stracke, Stefan Andreas Baumann, Vincent Tao Hu, Björn Ommer
PDF
FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection Jianwei Zhao, Xin Li, Fan Yang, Qiang Zhai, Ao Luo, Zhicheng Jiao, Hong Cheng
PDF
Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models Yuchen Yang, Kwonjoon Lee, Behzad Dariush, Yinzhi Cao, Shao-Yuan Lo
PDF
FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation Xinzhi Mu, Li Chen, Bohan Chen, Shuyang Gu, Jianmin Bao, Dong Chen, Ji Li, Yuhui Yuan
PDF
Forbes: Face Obfuscation Rendering via Backpropagation Refinement Scheme Jintae Kim, Seungwon Yang, Seong-Gyun Jeong, Chang-Su Kim
PDF
Forecasting Future Videos from Novel Views via Disentangled 3D Scene Representation Sudhir Yarram, Junsong Yuan
PDF
Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis Qi Sun, Hang Zhou, Wengang Zhou, Li Li, Houqiang Li
PDF
Forget More to Learn More: Domain-Specific Feature Unlearning for Semi-Supervised and Unsupervised Domain Adaptation Hritam Basak, Zhaozheng Yin
PDF
Formula-Supervised Visual-Geometric Pre-Training Ryosuke Yamada, Kensho Hara, Hirokatsu Kataoka, Koshi Makihara, Nakamasa Inoue, Rio Yokota, Yutaka Satoh
PDF
Foster Adaptivity and Balance in Learning with Noisy Labels Mengmeng Sheng, Zeren Sun, Tao Chen, Shuchao Pang, Yucheng Wang, Yazhou Yao
PDF
FoundPose: Unseen Object Pose Estimation with Foundation Features Evin Pınar Örnek, Yann Labbé, Bugra Tekin, Lingni Ma, Cem Keskin, Christian Forster, Tomas Hodan
PDF
Four Ways to Improve Verbo-Visual Fusion for Dense 3D Visual Grounding Ozan Unal, Christos Sakaridis, Suman Saha, Luc Van Gool
PDF
FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis Linjiang Huang, Rongyao Fang, Aiping Zhang, Guanglu Song, Si Liu, Yu Liu, Hongsheng Li
PDF
FRDiff: Feature Reuse for Universal Training-Free Acceleration of Diffusion Models Junhyuk So, Jungwon Lee, Eunhyeok Park
PDF
Free Lunch for Gait Recognition: A Novel Relation Descriptor Jilong Wang, Saihui Hou, Yan Huang, Chunshui Cao, Xu Liu, Yongzhen Huang, Tianzhu Zhang, Liang Wang
PDF
Free-ATM: Harnessing Free Attention Masks for Representation Learning on Diffusion-Generated Images David Junhao Zhang, Mutian Xu, Jay Zhangjie Wu, Chuhui Xue, Wenqing Zhang, Xiaoguang Han, Song Bai, Mike Zheng Shou
PDF
Free-Editor: Zero-Shot Text-Driven 3D Scene Editing Nazmul Karim, Hasan Iqbal, Umar Khalid, Chen Chen, Jing Hua
PDF
Free-Viewpoint Video of Outdoor Sports Using a Drone Zhengdong Hong
PDF
Free-VSC: Free Semantics from Visual Foundation Models for Unsupervised Video Semantic Compression Yuan Tian, Guo Lu, Guangtao Zhai
PDF
FreeAugment: Data Augmentation Search Across All Degrees of Freedom Tom Bekor, Niv Nayman, Lihi Zelnik-Manor
PDF
FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior Zhekai Chen, Wen Wang, Zhen Yang, Zeqing Yuan, Hao Chen, Chunhua Shen
PDF
FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models Wei Wu, Qingnan Fan, Shuai Qin, Hong Gu, Ruoyu Zhao, Antoni Chan
PDF
FreeInit: Bridging Initialization Gap in Video Diffusion Models Tianxing Wu, Chenyang Si, Yuming Jiang, Ziqi Huang, Ziwei Liu
PDF
FreeMotion: A Unified Framework for Number-Free Text-to-Motion Synthesis Ke Fan, Junshu Tang, Weijian Cao, Ran Yi, Moran Li, Jingyu Gong, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Lizhuang Ma
PDF
FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models Zhikai Zhang, Yitang Li, Haofeng Huang, Mingxian Lin, Li Yi
PDF
FreestyleRet: Retrieving Images from Style-Diversified Queries Hao Li, Yanhao Jia, Peng Jin, Zesen Cheng, Kehan Li, Jialu Sui, Chang Liu, Li Yuan
PDF
Freeview Sketching: View-Aware Fine-Grained Sketch-Based Image Retrieval Aneeshan Sain, Pinaki Nath Chowdhury, Subhadeep Koley, Ayan Kumar Bhunia, Yi-Zhe Song
PDF
FreeZe: Training-Free Zero-Shot 6D Pose Estimation with Geometric and Vision Foundation Models Andrea Caraffa, Davide Boscaini, Amir Hamza, Fabio Poiesi
PDF
FrePolad: Frequency-Rectified Point Latent Diffusion for Point Cloud Generation Chenliang Zhou, Fangcheng Zhong, Param Hanji, Zhilin Guo, Kyle Thomas Fogarty, Alejandro Sztrajman, Hongyun Gao, A. Cengiz Oztireli
PDF
Frequency-Spatial Entanglement Learning for Camouflaged Object Detection Yanguang Sun, Chunyan Xu, Jian Yang, Hanyu Xuan, Lei Luo
PDF
FREST: Feature RESToration for Semantic Segmentation Under Multiple Adverse Conditions Sohyun Lee, Namyup Kim, Sungyeon Kim, Suha Kwak
PDF
FRI-Net: Floorplan Reconstruction via Room-Wise Implicit Representation Honghao Xu, Juzhan Xu, Zeyu Huang, Pengfei Xu, Hui Huang, Ruizhen Hu
PDF
From Fake to Real: Pretraining on Balanced Synthetic Images to Prevent Spurious Correlations in Image Recognition Maan Qraitem, Kate Saenko, Bryan A. Plummer
PDF
From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation Yunfei Xie, Cihang Xie, Alan Yuille, Jieru Mei
PDF
Frontier-Enhanced Topological Memory with Improved Exploration Awareness for Embodied Visual Navigation Xinru Cui, Qiming Liu, Zhe Liu, Hesheng Wang
PDF
FroSSL: Frobenius Norm Minimization for Efficient Multiview Self-Supervised Learning Oscar Skean, Aayush Dhakal, Nathan Jacobs, Luis G Sanchez Giraldo
PDF
Frugal 3D Point Cloud Model Training via Progressive Near Point Filtering and Fused Aggregation Donghyun Lee, Yejin Lee, Jae W. Lee, Hongil Yoon
PDF
FSD-BEV: Foreground Self-Distillation for Multi-View 3D Object Detection Zheng Jiang, Jinqing Zhang, Yanan Zhang, Qingjie Liu, Zhenghui Hu, Baohui Wang, Yunhong Wang
PDF
FSGS: Real-Time Few-Shot View Synthesis Using Gaussian Splatting Zehao Zhu, Zhiwen Fan, Yifan Jiang, Zhangyang Wang
PDF
FTBC: Forward Temporal Bias Correction for Optimizing ANN-SNN Conversion Xiaofeng Wu, Velibor Bojkovic, Bin Gu, Kun Suo, Kai Zou
PDF
Fully Authentic Visual Question Answering Dataset from Online Communities Chongyan Chen, Mengchen Liu, Noel C Codella, Yunsheng Li, Lu Yuan, Danna Gurari
PDF
Fully Sparse 3D Occupancy Prediction Haisong Liu, Yang Chen, Haiguang Wang, Zetong Yang, Tianyu Li, Jia Zeng, Li Chen, Hongyang Li, Limin Wang
PDF
Functional Transform-Based Low-Rank Tensor Factorization for Multi-Dimensional Data Recovery Jian-Li Wang, Xi-Le Zhao
PDF
Fundamental Matrix Estimation Using Relative Depths Yaqing Ding, Václav Vávra, Snehal Bhayani, Qianliang Wu, Jian Yang, Zuzana Kukelova
PDF
FunQA: Towards Surprising Video Comprehension Binzhu Xie, Sicheng Zhang, Zitang Zhou, Bo Li, Yuanhan Zhang, Jack Hessel, Jingkang Yang, Ziwei Liu
PDF
FuseTeacher: Modality-Fused Encoders Are Strong Vision Supervisors Chen-Wei Xie, Siyang Sun, Liming Zhao, Pandeng Li, Shuailei Ma, Yun Zheng
PDF
FutureDepth: Learning to Predict the Future Improves Video Depth Estimation Rajeev Yasarla, Manish Kumar Singh, Hong Cai, Yunxiao Shi, Jisoo Jeong, Yinhao Zhu, Shizhong Han, Risheek Garrepalli, Fatih Porikli
PDF
FYI: Flip Your Images for Dataset Distillation Byunggwan Son, Youngmin Oh, Donghyeon Baek, Bumsub Ham
PDF
G2fR: Frequency Regularization in Grid-Based Feature Encoding Neural Radiance Fields Shuxiang Xie, Shuyi Zhou, Ken Sakurada, Ryoichi Ishikawa, Masaki Onishi, Takeshi Oishi
PDF
G3R: Gradient Guided Generalizable Reconstruction Yun Chen, Jingkang Wang, Ze Yang, Sivabalan Manivasagam, Raquel Urtasun
PDF
GalLoP: Learning Global and Local Prompts for Vision-Language Models Marc Lafon, Elias Ramzi, Clément Rambour, Nicolas Audebert, Nicolas Thome
PDF
GAMMA-FACE: GAussian Mixture Models Amend Diffusion Models for Bias Mitigation in Face Images Basudha Pal, Arunkumar Kannan, Ram Prabhakar Kathirvel, Alice O'Toole, Rama Chellappa
PDF
GAReT: Cross-View Video Geolocalization with Adapters and Auto-Regressive Transformers Manu S Pillai, Mamshad Nayeem Rizve, Mubarak Shah
PDF
GarmentAligner: Text-to-Garment Generation via Retrieval-Augmented Multi-Level Corrections Shiyue Zhang, Zheng Chong, Xujie Zhang, Hanhui Li, Yuhao Cheng, Yiqiang Yan, Xiaodan Liang
PDF
GarmentCodeData: A Dataset of 3D Made-to-Measure Garments with Sewing Patterns Maria Korosteleva, Timur Levent Kesdogan, Fabian Kemper, Stephan Wenninger, Jasmin Koller, Yuhan Zhang, Mario Botsch, Olga Sorkine-Hornung
PDF
Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation Olga Zatsarynna, Emad Bahrami, Yazan Abu Farha, Gianpiero Francesca, Jürgen Gall
PDF
GAURA: Generalizable Approach for Unified Restoration and Rendering of Arbitrary Views Vinayak Gupta, Rongali Simhachala Venkata Girish, Mukund Varma T, Ayush Tewari, Kaushik Mitra
PDF
GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing Jing Wu, Jia-Wang Bian, Xinghui Li, Guangrun Wang, Ian Reid, Philip Torr, Victor Adrian Prisacariu
PDF
Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering Antoine Guédon, Vincent Lepetit
PDF
Gaussian Grouping: Segment and Edit Anything in 3D Scenes Mingqiao Ye, Martin Danelljan, Fisher Yu, Lei Ke
PDF
Gaussian in the Wild: 3D Gaussian Splatting for Unconstrained Image Collections Dongbin Zhang, Chuming Wang, Weitao Wang, Peihao Li, Minghan Qin, Haoqian Wang
PDF
Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion Otto Seiskari, Jerry Ylilammi, Valtteri Kaatrasalo, Pekka Rantalankila, Matias Turkulainen, Juho Kannala, Esa Rahtu, Arno Solin
PDF
GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction Yuanhui Huang, Wenzhao Zheng, Yunpeng Zhang, Jie Zhou, Jiwen Lu
PDF
GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting Xinjie Zhang, Xingtong Ge, Tongda Xu, Dailan He, Yan Wang, Hongwei Qin, Guo Lu, Jing Geng, Jun Zhang
PDF
GaussReg: Fast 3D Registration with Gaussian Splatting Jiahao Chang, Yinglin Xu, Yihao Li, Yuantao Chen, Wensen Feng, Xiaoguang Han
PDF
Gaze Target Detection Based on Head-Local-Global Coordination Yaokun Yang, Feng Lu
PDF
GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths Xianyu Chen, Ming Jiang, Qi Zhao
PDF
General and Task-Oriented Video Segmentation Mu Chen, Liulei Li, Wenguan Wang, Ruijie Quan, Yi Yang
PDF
General Geometry-Aware Weakly Supervised 3D Object Detection Guowen Zhang, Junsong Fan, Liyi Chen, Zhaoxiang Zhang, Zhen Lei, Lei Zhang
PDF
GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features Luc P.J. Sträter, Mohammadreza Salehi, Efstratios Gavves, Cees G.M. Snoek, Yuki M. Asano
PDF
Generalizable Facial Expression Recognition Yuhang Zhang, Xiuqi Zheng, Chenyi Liang, Jiani Hu, Weihong Deng
PDF
Generalizable Human Gaussians for Sparse View Synthesis YoungJoong Kwon, Baole Fang, Yixing Lu, Haoye Dong, Cheng Zhang, Francisco Vicente Carrasco, Albert Mosella-Montoro, Jianjin Xu, Shingo J Takagi, Daeil Kim, Aayush Prakash, Fernando de la Torre
PDF
Generalizable Symbolic Optimizer Learning Xiaotian Song, Peng Zeng, Yanan Sun, Andy Song
PDF
Generalized Coverage for More Robust Low-Budget Active Learning Wonho Bae, Junhyug Noh, Danica J. Sutherland
PDF
Generalizing to Unseen Domains via Text-Guided Augmentation Daiqing Qi, Handong Zhao, Aidong Zhang, Sheng Li
PDF
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes Ibrahim Ethem Hamamci, Sezgin Er, Anjany Sekuboyina, Enis Simsar, Alperen Tezcan, Ayse Gulnihan Simsek, Sevval Nil Esirgun, Furkan Almas, Irem Dogan, Muhammed Furkan Dasdelen, Chinmay Prabhakar, Hadrien Reynaud, Sarthak Pati, Christian Bluethgen, Mehmet Kemal Ozdemir, Bjoern Menze
PDF
Generating 3D House Wireframes with Semantics Xueqi Ma, Yilin Liu, Wenjun Zhou, Ruowei Wang, Hui Huang
PDF
Generating Human Interaction Motions in Scenes with Text Control Hongwei Yi, Justus Thies, Michael J. Black, Xue Bin Peng, Davis Rempe
PDF
Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs Aayam Shrestha, Pan Liu, German Ros, Kai Yuan, Alan Fern
PDF
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis Basile Van Hoorick, Rundi Wu, Ege Ozguroglu, Kyle Sargent, Ruoshi Liu, Pavel Tokmakov, Achal Dave, Changxi Zheng, Carl Vondrick
PDF
Generative End-to-End Autonomous Driving Wenzhao Zheng, Ruiqi Song, Xianda Guo, Chenming Zhang, Long Chen
PDF
GENIXER: Empowering Multimodal Large Language Models as a Powerful Data Generator Henry Hengyuan Zhao, Pan Zhou, Mike Zheng Shou
PDF
GenQ: Quantization in Low Data Regimes with Generative Synthetic Data Yuhang Li, Youngeun Kim, Donghyun Lee, Souvik Kundu, Priyadarshini Panda
PDF
GenRC: Generative 3D Room Completion from Sparse Image Collections Ming-Feng Li, Yueh-Feng Ku, Hong-Xuan Yen, Chi Liu, Yu-Lun Liu, Albert Y Chen, Cheng-Hao Kuo, Min Sun
PDF
GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning Xiaojie Li, Yibo Yang, Xiangtai Li, Jianlong Wu, Yue Yu, Bernard Ghanem, Min Zhang
PDF
GeoCalib: Learning Single-Image Calibration with Geometric Optimization Alexander Veicht, Paul-Edouard Sarlin, Philipp Lindenberger, Marc Pollefeys
PDF
GeoGaussian: Geometry-Aware Gaussian Splatting for Scene Rendering Yanyan Li, Chenyu Lyu, Yan Di, Guangyao Zhai, Gim Hee Lee, Federico Tombari
PDF
Geometry Fidelity for Spherical Images Anders Christensen, Nooshin Mojab, Khushman Patel, Karan Ahuja, Zeynep Akata, Ole Winther, Mar Gonzalez Franco, Andrea Colaco
PDF
GeometrySticker: Enabling Ownership Claim of Recolorized Neural Radiance Fields Xiufeng Huang, Ka Chun Cheung, Simon See, Renjie Wan
PDF
Geospecific View Generation – Geometry-Context Aware High-Resolution Ground View Inference from Satellite Views Ningli Xu, Rongjun Qin
PDF
GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image Xiao Fu, Wei Yin, Mu Hu, Kaixuan Wang, Yuexin Ma, Ping Tan, Shaojie Shen, Dahua Lin, Xiaoxiao Long
PDF
Get Your Embedding Space in Order: Domain-Adaptive Regression for Forest Monitoring Sizhuo Li, Dimitri Gominski, Martin Brandt, Xiaoye Tong, Philippe Ciais
PDF
Getting It Right: Improving Spatial Consistency in Text-to-Image Models Agneet Chatterjee, Gabriela Ben Melech Stan, Estelle Guez Aflalo, Sayak Paul, Dhruba Ghosh, Tejas Gokhale, Ludwig Schmidt, Hanna Hajishirzi, Vasudev Lal, Chitta R Baral, Yezhou Yang
PDF
GGRt: Towards Generalizable 3D Gaussians Without Pose Priors in Real-Time Hao Li, Yuanyuan Gao, Dingwen Zhang, Chenming Wu, Yalun Dai, Chen Zhao, Haocheng Feng, Errui Ding, Jingdong Wang, Junwei Han
PDF
GiT: Towards Generalist Vision Transformer Through Universal Language Interface Haiyang Wang, Hao Tang, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, Liwei Wang
PDF
GIVT: Generative Infinite-Vocabulary Transformers Michael Tschannen, Cian Eastwood, Fabian Mentzer
PDF
GKGNet: Group K-Nearest Neighbor Based Graph Convolutional Network for Multi-Label Image Recognition Ruijie Yao, Sheng Jin, Lumin Xu, Wang Zeng, Wentao Liu, Chen Qian, Ping Luo, Ji Wu
PDF
GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection Hang Yao, Ming Liu, Zhicun Yin, Zifei Yan, Xiaopeng Hong, Wangmeng Zuo
PDF
GLARE: Low Light Image Enhancement via Generative Latent Feature Based Codebook Retrieval Han Zhou, Wei Dong, Xiaohong Liu, Shuaicheng Liu, Xiongkuo Min, Guangtao Zhai, Jun Chen
PDF
Global Counterfactual Directions Bartłomiej Sobieski, Przemyslaw Biecek
PDF
Global Structure-from-Motion Revisited Linfei Pan, Daniel Barath, Marc Pollefeys, Johannes L Schönberger
PDF
Global-Local Collaborative Inference with LLM for LiDAR-Based Open-Vocabulary Detection Xingyu Peng, Yan Bai, Chen Gao, Lirong Yang, Fei Xia, Beipeng Mu, Xiaofei Wang, Si Liu
PDF
Global-to-Pixel Regression for Human Mesh Recovery Yabo Xiao, Mingshu He, Dongdong Yu
PDF
GlobalPointer: Large-Scale Plane Adjustment with Bi-Convex Relaxation Bangyan Liao, Zhenjun Zhao, Lu Chen, Haoang Li, Daniel Cremers, Peidong Liu
PDF
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering Zeyu Liu, Weicong Liang, Zhanhao Liang, Chong Luo, Ji Li, Gao Huang, Yuhui Yuan
PDF
GMM-IKRS: Gaussian Mixture Models for Interpretable Keypoint Refinement and Scoring Emanuele Santellani, Martin Zach, Christian Sormann, Mattia Rossi, Andreas Kuhn, Friedrich Fraundorfer
PDF
GMT: Enhancing Generalizable Neural Rendering via Geometry-Driven Multi-Reference Texture Transfer Youngho Yoon, Hyun-Kurl Jang, Kuk-Jin Yoon
PDF
GOEmbed: Gradient Origin Embeddings for Representation Agnostic 3D Feature Learning Animesh Karnewar, Roman Shapovalov, Tom Monnier, Andrea Vedaldi, Niloy J. Mitra, David Novotny
PDF
Goldfish: Vision-Language Understanding of Arbitrarily Long Videos Kirolos Ataallah, Xiaoqian Shen, Eslam Mohamed Abdelrahman, Essam Sleiman, Mingchen Zhuge, Jian Ding, Deyao Zhu, Jürgen Schmidhuber, Mohamed Elhoseiny
PDF
Good Teachers Explain: Explanation-Enhanced Knowledge Distillation Amin Parchami-Araghi, Moritz Böhle, Sukrut Rao, Bernt Schiele
PDF
GPSFormer: A Global Perception and Local Structure Fitting-Based Transformer for Point Cloud Understanding Changshuo Wang, Meiqing Wu, Siew-Kei Lam, Xin Ning, Shangshu Yu, Ruiping Wang, Weijun Li, Thambipillai Srikanthan
PDF
GRA: Detecting Oriented Objects Through Group-Wise Rotating and Attention Jiangshan Wang, Yifan Pu, Yizeng Han, Jiayi Guo, Yiru Wang, Xiu Li, Gao Huang
PDF
GRACE: Graph-Based Contextual Debiasing for Fair Visual Question Answering Yifeng Zhang, Ming Jiang, Qi Zhao
PDF
Gradient-Aware for Class-Imbalanced Semi-Supervised Medical Image Segmentation Wenbo Qi, Jiafei Wu, S. C. Chan
PDF
Gradient-Based Out-of-Distribution Detection Taha Entesari, Sina Sharifi, Bardia Safaei, Vishal Patel, Mahyar Fazlyab
PDF
GRAPE: Generalizable and Robust Multi-View Facial Capture Jing Li, Di Kang, Zhenyu He
PDF
Graph Neural Network Causal Explanation via Neural Causal Models Arman Behnam, Binghui Wang
PDF
GraphBEV: Towards Robust BEV Feature Alignment for Multi-Modal 3D Object Detection Ziying Song, Lei Yang, Shaoqing Xu, Lin Liu, Dongyang Xu, Caiyan Jia, Feiyang Jia, Li Wang
PDF
GraspXL: Generating Grasping Motions for Diverse Objects at Scale Hui Zhang, Sammy Christen, Zicong Fan, Otmar Hilliges, Jie Song
PDF
Gravity-Aligned Rotation Averaging with Circular Regression Linfei Pan, Marc Pollefeys, Daniel Barath
PDF
Grid-Attention: Enhancing Computational Efficiency of Large Vision Models Without Fine-Tuning Pengyu Li, Biao Wang, Tianchu Guo, Xian-Sheng Hua
PDF
GRIDS: Grouped Multiple-Degradation Restoration with Image Degradation Similarity Shuo Cao, Yihao Liu, Wenlong Zhang, Yu Qiao, Chao Dong
PDF
Griffon: Spelling Out All Object Locations at Any Granularity with Large Language Models Yufei Zhan, Yousong Zhu, Zhiyang Chen, Fan Yang, Ming Tang, Jinqiao Wang
PDF
GRiT: A Generative Region-to-Text Transformer for Object Understanding Jialian Wu, Jianfeng Wang, Zhengyuan Yang, Zhe Gan, Zicheng Liu, Junsong Yuan, Lijuan Wang
PDF
GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation Yinghao Xu, Zifan Shi, Wang Yifan, Hansheng Chen, Ceyuan Yang, Sida Peng, Yujun Shen, Gordon Wetzstein
PDF
GroCo: Ground Constraint for Metric Self-Supervised Monocular Depth Aurélien Cecille, Stefan Duffner, Franck Davoine, Thibault Neveu, Rémi Agier
PDF
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Chuofan Ma, Yi Jiang, Jiannan Wu, Zehuan Yuan, Xiaojuan Qi
PDF
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Qing Jiang, Chunyuan Li, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang
PDF
Grounding Image Matching in 3D with MASt3R Vincent Leroy, Yohann Cabon, Jerome Revaud
PDF
Grounding Language Models for Visual Entity Recognition Zilin Xiao, Ming Gong, Paola Cascante-Bonilla, Xingyao Zhang, Jie Wu, Vicente Ordonez
PDF
GroundUp: Rapid Sketch-Based 3D City Massing Gizem Esra Unlu, Mohamed Sayed, Yulia Gryaditskaya, Gabriel Brostow
PDF
Group Testing for Accurate and Efficient Range-Based Near Neighbor Search for Plagiarism Detection Harsh Shah, Kashish Mittal, Ajit Rajwade
PDF
GroupDiff: Diffusion-Based Group Portrait Editing Yuming Jiang, Nanxuan Zhao, Qing Liu, Krishna Kumar Singh, Shuai Yang, Chen Change Loy, Ziwei Liu
PDF
GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, Zexiang Xu
PDF
GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence Pengyuan Wang, Takuya Ikeda, Robert Lee, Koichi Nishiwaki
PDF
GS2Mesh: Surface Reconstruction from Gaussian Splatting via Novel Stereo Views Yaniv Wolf, Amit Bracha, Ron Kimmel
PDF
GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction Yuxuan Mu, Xinxin Zuo, Chuan Guo, Yilin Wang, Juwei Lu, Xiaofei Wu, Songcen Xu, Peng Dai, Youliang Yan, Li Cheng
PDF
GTMS: A Gradient-Driven Tree-Guided Mask-Free Referring Image Segmentation Method Haoxin Lv, Tianxiong Zhong, Sanyuan Zhao
PDF
GTP-4o: Modality-Prompted Heterogeneous Graph Learning for Omni-Modal Biomedical Representation Chenxin Li, Xinyu Liu, Cheng Wang, Yifan Liu, Weihao Yu, Jing Shao, Yixuan Yuan
PDF
GTPT: Group-Based Token Pruning Transformer for Efficient Human Pose Estimation Haonan Wang, Jie Liu, Jie Tang, Gangshan Wu, Bo Xu, Yanbing Chou, Yong Wang
PDF
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing Vadim Titov, Madina Khalmatova, Alexandra Ivanova, Dmitry P Vetrov, Aibek Alanov
PDF
GVGEN: Text-to-3D Generation with Volumetric Representation Xianglong He, Junyi Chen, Sida Peng, Di Huang, Yangguang Li, Xiaoshui Huang, Chun Yuan, Wanli Ouyang, Tong He
PDF
H-V2X: A Large Scale Highway Dataset for BEV Perception Chang Liu, MingXu Zhu, Cong Ma
PDF
HAC: Hash-Grid Assisted Context for 3D Gaussian Splatting Compression Yihang Chen, Qianyi Wu, Weiyao Lin, Mehrtash Harandi, Jianfei Cai
PDF
HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning Zhecan Wang, Garrett Bingham, Adams Wei Yu, Quoc V. Le, Thang Luong, Golnaz Ghiasi
PDF
HandDAGT: A Denoising Adaptive Graph Transformer for 3D Hand Pose Estimation Wencan Cheng, Eunji Kim, Jong Hwan Ko
PDF
HandDGP: Camera-Space Hand Mesh Prediction with Differentiable Global Positioning Eugene Valassakis, Guillermo Garcia-Hernando
PDF
Handling the Non-Smooth Challenge in Tensor SVD: A Multi-Objective Tensor Recovery Framework Jingjing Zheng, Wanglong Lu, Wenzhe Wang, Yankai Cao, Xiaoqin Zhang, Xianta Jiang
PDF
HARIVO: Harnessing Text-to-Image Models for Video Generation Mingi Kwon, Seoung Wug Oh, Yang Zhou, Joon-Young Lee, Difan Liu, Haoran Cai, Baqiao Liu, Feng Liu, Youngjung Uh
PDF
Harmonizing Knowledge Transfer in Neural Network with Unified Distillation Yaomin Huang, Faming Fang, Zaoming Yan, Chaomin Shen, Guixu Zhang
PDF
Harnessing Text-to-Image Diffusion Models for Category-Agnostic Pose Estimation Duo Peng, Zhengbo Zhang, Ping Hu, Qiuhong Ke, David Yau, Jun Liu
PDF
HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization Sakib Reza, Yuexi Zhang, Mohsen Moghaddam, Octavia Camps
PDF
Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360° Yuxiao He, Yiyu Zhuang, Yanwen Wang, Yao Yao, Siyu Zhu, Xiaoyu Li, Qi Zhang, Xun Cao, Hao Zhu
PDF
HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting Helisa Dhamo, Yinyu Nie, Arthur Moreau, Jifei Song, Richard Shaw, Yiren Zhou, Eduardo Pérez-Pellitero
PDF
HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting Zhenglin Zhou, Fan Ma, Hehe Fan, Zongxin Yang, Yi Yang
PDF
HENet: Hybrid Encoding for End-to-End Multi-Task 3D Perception from Multi-View Cameras Zhongyu Xia, ZhiWei Lin, Xinhao Wang, Yongtao Wang, Yun Xing, Shengxiang Qi, Nan Dong, Ming-Hsuan Yang
PDF
HERGen: Elevating Radiology Report Generation with Longitudinal Data Fuying Wang, Shenghui Du, Lequan Yu
PDF
Hetecooper: Feature Collaboration Graph for Heterogeneous Collaborative Perception Congzhang Shao, Guiyang Luo, Quan Yuan, Yifu Chen, Yilin Liu, Gong Kexin, Jinglin Li
PDF
Heterogeneous Graph Learning for Scene Graph Prediction in 3D Point Clouds Yanni Ma, Hao Liu, Yun Pei, Yulan Guo
PDF
HGL: Hierarchical Geometry Learning for Test-Time Adaptation in 3D Point Cloud Segmentation Tianpei Zou, Sanqing Qu, Zhijun Li, Alois C. Knoll, Lianghua He, Guang Chen, Changjun Jiang
PDF
HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models Shen Zhang, Zhaowei Chen, Zhenyu Zhao, Yuhao Chen, Yao Tang, Jiajun Liang
PDF
Hiding Imperceptible Noise in Curvature-Aware Patches for 3D Point Cloud Attack Mingyu Yang, Daizong Liu, Keke Tang, Pan Zhou, Lixing Chen, Junyang Chen
PDF
HiEI: A Universal Framework for Generating High-Quality Emerging Images from Natural Images Jingmeng Li, Lukang Fu, Surun Yang, Hui Wei
PDF
Hierarchical Conditioning of Diffusion Models Using Tree-of-Life for Studying Species Evolution Mridul Khurana, Arka Daw, M. Maruf, Josef C. Uyeda, Wasila Dahdul, Caleb Charpentier, Yasin Bakış, Henry L. Bart Jr., Paula M. Mabee, Hilmar Lapp, James P. Balhoff, Wei-Lun Chao, Charles Stewart, Tanya Berger-Wolf, Anuj Karpatne
PDF
Hierarchical Gaussian Mixture Normalizing Flow Modeling for Unified Anomaly Detection Xincheng Yao, Ruoqi Li, Zefeng Qian, Lu Wang, Chongyang Zhang
PDF
Hierarchical Separable Video Transformer for Snapshot Compressive Imaging Ping Wang, Yulun Zhang, Lishun Wang, Xin Yuan
PDF
Hierarchical Temporal Context Learning for Camera-Based Semantic Scene Completion Bohan Li, Jiajun Deng, Wenyao Zhang, Zhujin Liang, Dalong Du, Xin Jin, Wenjun Zeng
PDF
Hierarchical Unsupervised Relation Distillation for Source Free Domain Adaptation Bowei Xing, Xianghua Ying, Ruibin Wang, Ruohao Guo, Ji Shi, Wenzhen Yue
PDF
Hierarchically Structured Neural Bones for Reconstructing Animatable Objects from Casual Videos Subin Jeon, In Cho, Minsu Kim, Woong Oh Cho, Seon Joo Kim
PDF
HiFi-123: Towards High-Fidelity One Image to 3D Content Generation Wangbo Yu, Li Yuan, Yan-Pei Cao, Xiangjun Gao, Xiaoyu Li, Wenbo Hu, Long Quan, Ying Shan, Yonghong Tian
PDF
HiFi-Score: Fine-Grained Image Description Evaluation with Hierarchical Parsing Graphs Ziwei Yao, Ruiping Wang, Xilin Chen
PDF
High-Fidelity 3D Textured Shapes Generation by Sparse Encoding and Adversarial Decoding Qi Zuo, Xiaodong Gu, Yuan Dong, Zhengyi Zhao, Weihao Yuan, Qiu Lingteng, Liefeng Bo, Zilong Dong
PDF
High-Fidelity and Transferable NeRF Editing by Frequency Decomposition Yisheng He, Weihao Yuan, Siyu Zhu, Zilong Dong, Liefeng Bo, Qixing Huang
PDF
High-Fidelity Modeling of Generalizable Wrinkle Deformation Jingfan Guo, Jae Shin Yoon, Shunsuke Saito, Takaaki Shiratori, Hyun Soo Park
PDF
High-Precision Self-Supervised Monocular Depth Estimation with Rich-Resource Prior Jianbing Shen, Wencheng Han
PDF
High-Quality Mesh Blendshape Generation from Face Videos via Neural Inverse Rendering Xin Ming, Jiawei Li, Jingwang Ling, Libo Zhang, Feng Xu
PDF
High-Resolution and Few-Shot View Synthesis from Asymmetric Dual-Lens Inputs Ruikang Xu, Mingde Yao, Yue Li, Yueyi Zhang, Zhiwei Xiong
PDF
HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects Xintao Lv, Liang Xu, Yichao Yan, Xin Jin, Congsheng Xu, Wu Shuwen, Yifan Liu, Lincheng Li, Mengxiao Bi, Wenjun Zeng, Xiaokang Yang
PDF
HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution Xiang Zhang, Yulun Zhang, Fisher Yu
PDF
HO-Gaussian: Hybrid Optimization of 3D Gaussian Splatting for Urban Scenes Zhuopeng Li, Yilin Zhang, Chenming Wu, Jianke Zhu, Liangjun Zhang
PDF
HoloADMM: High-Quality Holographic Complex Field Recovery Mazen Mel, Paul Springer, Pietro Zanuttigh, Haitao Zhou, Alexander Gatto
PDF
Holodepth: Programmable Depth-Varying Projection via Computer-Generated Holography Dorian Chan, Matthew O'Toole, Sizhuo Ma, Jian Wang
PDF
How Far Can a 1-Pixel Camera Go? Solving Vision Tasks Using Photoreceptors and Computationally Designed Visual Morphology Andrei Atanov, Rishubh Singh, Jiawei Fu, Isabella Yu, Andrew Spielberg, Amir Zamir
PDF
How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs Haoqin Tu, Chenhang Cui, Zijun Wang, Yiyang Zhou, Bingchen Zhao, Junlin Han, Wangchunshu Zhou, Huaxiu Yao, Cihang Xie
PDF
How to Train the Teacher Model for Effective Knowledge Distillation Shayan Mohajer Hamidi, Xizhen Deng, Renhao Tan, Linfeng Ye, Ahmed Hussein Salamah
PDF
How Video Meetings Change Your Expression Sumit Sarin, Utkarsh Mall, Purva Tendulkar, Carl Vondrick
PDF
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale Nina Shvetsova, Anna Kukleva, Xudong Hong, Christian Rupprecht, Bernt Schiele, Hilde Kuehne
PDF
HPE-Li: WiFi-Enabled Lightweight Dual Selective Kernel Convolution for Human Pose Estimation Toan D. Gian, Tien Dac Lai, Thien Van Luong, Kok-Seng Wong, Van-Dinh Nguyen
PDF
HPFF: Hierarchical Locally Supervised Learning with Patch Feature Fusion Junhao Su, Chenghao He, Feiyu Zhu, Xiaojie Xu, Dongzhi Guan, Chenyang Si
PDF
HSR: Holistic 3D Human-Scene Reconstruction from Monocular Videos Lixin Xue, Chen Guo, Chengwei Zheng, Fangjinhua Wang, Tianjian Jiang, Hsuan-I Ho, Manuel Kaufmann, Jie Song, Otmar Hilliges
PDF
Human Hair Reconstruction with Strand-Aligned 3D Gaussians Egor Zakharov, Vanessa Sklyarova, Michael J. Black, Giljoo Nam, Justus Thies, Otmar Hilliges
PDF
Human Motion Forecasting in Dynamic Domain Shifts: A Homeostatic Continual Test-Time Adaptation Framework Qiongjie Cui, Huaijiang Sun, Bin Li, Jianfeng Lu, Weiqing Li
PDF
Human Pose Recognition via Occlusion-Preserving Abstract Images Saad Manzur, Wayne B Hayes
PDF
Human-in-the-Loop Visual Re-ID for Population Size Estimation Gustavo Perez, Daniel Sheldon, Grant Van Horn, Subhransu Maji
PDF
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-Fine Pose-Reversible Guidance Guian Fang, Wenbiao Yan, Yuanfan Guo, Jianhua Han, Zutao Jiang, Hang Xu, Shengcai Liao, Xiaodan Liang
PDF
HUMOS: Human Motion Model Conditioned on Body Shape Shashank Tripathi, Omid Taheri, Christoph Lassner, Michael J. Black, Daniel Holden, Carsten Stoll
PDF
HVCLIP: High-Dimensional Vector in CLIP for Unsupervised Domain Adaptation Noranart Vesdapunt, Kah Kuen Fu, Yue Wu, Xu Zhang, Pradeep Natarajan
PDF
Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation Kihong Kim, Haneol Lee, Jihye Park, Seyeon Kim, Kwang Hee Lee, Seungryong Kim, Jaejun Yoo
PDF
HybridBooth: Hybrid Prompt Inversion for Efficient Subject-Driven Generation Shanyan Guan, Yanhao Ge, Ying Tai, Jian Yang, Wei Li, Mingyu You
PDF
HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning Fucai Ke, Zhixi Cai, Simindokht Jahangard, Weiqing Wang, Pari Delir Haghighi, Hamid Rezatofighi
PDF
HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts Wonjae Kim, Sanghyuk Chun, Taekyung Kim, Dongyoon Han, Sangdoo Yun
PDF
Hyperion – A Fast, Versatile Symbolic Gaussian Belief Propagation Framework for Continuous-Time SLAM David Hug, Ignacio Alzugaray, Margarita Chli
PDF
Hypernetworks for Generalizable BRDF Representation Fazilet Gokbudak, Alejandro Sztrajman, Chenliang Zhou, Fangcheng Zhong, Rafal Mantiuk, A. Cengiz Oztireli
PDF
HyperSpaceX: Radial and Angular Exploration of HyperSpherical Dimensions Chiranjeev Chiranjeev, Muskan Dosi, Kartik Thakral, Mayank Vatsa, Richa Singh
PDF
HyTAS: A Hyperspectral Image Transformer Architecture Search Benchmark and Analysis Fangqin Zhou, Mert Kilickaya, Joaquin Vanschoren, Ran Piao
PDF
I Can't Believe It's Not Scene Flow! Ishan Khatri, Kyle Vedder, Neehar Peri, Deva Ramanan, James Hays
PDF
I-MedSAM: Implicit Medical Image Segmentation with Segment Anything Xiaobao Wei, Jiajun Cao, Yizhu Jin, Ming Lu, Guangyu Wang, Shanghang Zhang
PDF
I2-SLAM: Inverting Imaging Process for Robust Photorealistic Dense SLAM Gwangtak Bae, Changwoon Choi, Hyeongjun Heo, Sang Min Kim, Young Min Kim
PDF
IAM-VFI: Interpolate Any Motion for Video Frame Interpolation with Motion Complexity Map Kihwan Yoon, Yong Han Kim, Sungjei Kim, Jinwoo Jeong
PDF
Idea2Img: Iterative Self-Refinement with GPT-4V for Automatic Image Design and Generation Zhengyuan Yang, Jianfeng Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zicheng Liu, Lijuan Wang
PDF
Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition Lilang Lin, Lehong Wu, Jiahang Zhang, Jiaying Liu
PDF
Identity-Consistent Diffusion Network for Grading Knee Osteoarthritis Progression in Radiographic Imaging Wenhua Wu, Kun Hu, Wenxi Yue, Wei Li, Milena Simic, Changyang Li, Wei Xiang, Zhiyong Wang
PDF
Idling Neurons, Appropriately Lenient Workload During Fine-Tuning Leads to Better Generalization Hongjing Niu, Hanting Li, Bin Li, Feng Zhao
PDF
IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation Yuanhao Zhai, Kevin Lin, Linjie Li, Chung-Ching Lin, Jianfeng Wang, Zhengyuan Yang, David Doermann, Junsong Yuan, Zicheng Liu, Lijuan Wang
PDF
IFTR: An Instance-Level Fusion Transformer for Visual Collaborative Perception Shaohong Wang, Lu Bin, Xinyu Xiao, Zhiyu Xiang, Hangguan Shan, Eryun Liu
PDF
IG Captioner: Information Gain Captioners Are Strong Zero-Shot Classifiers Chenglin Yang, Siyuan Qiao, Yuan Cao, Yu Zhang, Tao Zhu, Alan Yuille, Jiahui Yu
PDF
IGNORE: Information Gap-Based False Negative Loss Rejection for Single Positive Multi-Label Learning Gyeong Ryeol Song, Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee
PDF
iHuman: Instant Animatable Digital Humans from Monocular Videos Pramish Paudel, Anubhav Khanal, Danda Pani Paudel, Jyoti Tandukar, Ajad Chhatkuli
PDF
Image Compression for Machine and Human Vision with Spatial-Frequency Adaptation Han Li, Shaohui Li, Shuangrui Ding, Wenrui Dai, Maida Cao, Chenglin Li, Junni Zou, Hongkai Xiong
PDF
Image Demoireing in RAW and sRGB Domains Shuning Xu, Binbin Song, Xiangyu Chen, Xina Liu, Jiantao Zhou
PDF
Image Manipulation Detection with Implicit Neural Representation and Limited Supervision Zhenfei Zhang, Mingyang Li, Xin Li, Ming-Ching Chang, Jun-Wei Hsieh
PDF
Image-Adaptive 3D Lookup Tables for Real-Time Image Enhancement with Bilateral Grids Wontae Kim, Nam Ik Cho
PDF
Image-Feature Weak-to-Strong Consistency: An Enhanced Paradigm for Semi-Supervised Learning Zhiyu Wu, Jinshi Cui
PDF
Image-to-LiDAR Relational Distillation for Autonomous Driving Data Anas Mahmoud, Ali Harakeh, Steven Waslander
PDF
Images Are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models Yifan Li, Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen
PDF
Imaging Interiors: An Implicit Solution to Electromagnetic Inverse Scattering Problems Ziyuan Luo, Boxin Shi, Haoliang Li, Renjie Wan
PDF
Imaging with Confidence: Uncertainty Quantification for High-Dimensional Undersampled MR Images Frederik Hoppe, Claudio Mayrink Verdun, Hannah Sophie Laus, Sebastian Endt, Marion Irene Menzel, Felix Krahmer, Holger Rauhut
PDF
iMatching: Imperative Correspondence Learning Zitong Zhan, Dasong Gao, Yun-Jou Lin, Youjie Xia, Chen Wang
PDF
IMMA: Immunizing Text-to-Image Models Against Malicious Adaptation Amber Yijia Zheng, Raymond A. Yeh
PDF
Implicit Concept Removal of Diffusion Models Zhili Liu, Kai Chen, Yifan Zhang, Jianhua Han, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung, James Kwok
PDF
Implicit Filtering for Learning Neural Signed Distance Functions from 3D Point Clouds Shengtao Li, Ge Gao, Yudong Liu, Ming Gu, Yu-Shen Liu
PDF
Implicit Neural Models to Extract Heart Rate from Video Pradyumna Chari, Anirudh Bindiganavale Harish, Adnan Armouti, Alexander Vilesov, Sanjit Sarda, Laleh Jalilian, Achuta Kadambi
PDF
Implicit Steganography Beyond the Constraints of Modality Sojeong Song, Seoyun Yang, Chang D. Yoo, Junmo Kim
PDF
Implicit Style-Content Separation Using B-LoRA Yarden Frenkel, Yael Vinker, Ariel Shamir, Danny Cohen-Or
PDF
Improving 2D Feature Representations by 3D-Aware Fine-Tuning Yuanwen Yue, Anurag Das, Francis Engelmann, Siyu Tang, Jan Eric Lenssen
PDF
Improving 3D Semi-Supervised Learning by Effectively Utilizing All Unlabelled Data Sneha Paul, Zachary Patterson, Nizar Bouguila
PDF
Improving Adversarial Transferability via Model Alignment Avery Ma, Amir-massoud Farahmand, Yangchen Pan, Philip Torr, Jindong Gu
PDF
Improving Agent Behaviors with RL Fine-Tuning for Autonomous Driving Zhenghao Peng, Wenjie Luo, Yiren Lu, Tianyi Shen, Cole Gulino, Ari Seff, Justin Fu
PDF
Improving Diffusion Models for Authentic Virtual Try-on in the Wild Yisol Choi, Sangkyung Kwak, Kyungmin Lee, Hyungwon Choi, Jinwoo Shin
PDF
Improving Domain Generalization in Self-Supervised Monocular Depth Estimation via Stabilized Adversarial Training Yuanqi Yao, Gang Wu, Kui Jiang, Siao Liu, Jian Kuai, Xianming Liu, Junjun Jiang
PDF
Improving Feature Stability During Upsampling – Spectral Artifacts and the Importance of Spatial Context Shashank Agnihotri, Julia Grabinski, Margret Keuper
PDF
Improving Geo-Diversity of Generated Images with Contextualized Vendi Score Guidance Reyhane Askari Hemmat, Melissa Hall, Alicia Yi Sun, Candace Ross, Michal Drozdzal, Adriana Romero-Soriano
PDF
Improving Hyperbolic Representations via Gromov-Wasserstein Regularization Yifei Yang, Wonjun Lee, Dongmian Zou, Gilad Lerman
PDF
Improving Image Synthesis with Diffusion-Negative Sampling Alakh Desai, Nuno Vasconcelos
PDF
Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models Nishad Singhi, Jae Myung Kim, Karsten Roth, Zeynep Akata
PDF
Improving Knowledge Distillation via Regularizing Feature Direction and Norm Yuzhu Wang, Lechao Cheng, Manni Duan, Yongheng Wang, Zunlei Feng, Shu Kong
PDF
Improving Medical Multi-Modal Contrastive Learning with Expert Annotations Yogesh Kumar, Pekka Marttinen
PDF
Improving Neural Surface Reconstruction with Feature Priors from Multi-View Images Xinlin Ren, Chenjie Cao, Yanwei Fu, Xiangyang Xue
PDF
Improving Point-Based Crowd Counting and Localization Based on Auxiliary Point Guidance I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu, Ming-Hsuan Yang, Sy-Yen Kuo
PDF
Improving Robustness to Model Inversion Attacks via Sparse Coding Architectures Sayanton V. Dibbo, Adam Breuer, Juston Moore, Michael Teti
PDF
Improving Text-Guided Object Inpainting with Semantic Pre-Inpainting Yifu Chen, Jingwen Chen, Yingwei Pan, Yehao Li, Ting Yao, Zhineng Chen, Tao Mei
PDF
Improving Unsupervised Domain Adaptation: A Pseudo-Candidate Set Approach Aveen Dayal, Rishabh Lalla, Linga Reddy Cenkeramaddi, C. Krishna Mohan, Abhinav Kumar, Vineeth N Balasubramanian
PDF
Improving Video Segmentation via Dynamic Anchor Queries Yikang Zhou, Tao Zhang, Xiangtai Li, Shunping Ji, Shuicheng Yan
PDF
Improving Virtual Try-on with Garment-Focused Diffusion Models Siqi Wan, Yehao Li, Jingwen Chen, Yingwei Pan, Ting Yao, Yang Cao, Tao Mei
PDF
Improving Vision and Language Concepts Understanding with Multimodal Counterfactual Samples Chengen Lai, Shengli Song, Sitong Yan, Guangneng Hu
PDF
Improving Zero-Shot Generalization for CLIP with Variational Adapter Ziqian Lu, Fengli Shen, Mushui Liu, Yunlong Yu, Xi Li
PDF
Improving Zero-Shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation Marco Mistretta, Alberto Baldrati, Marco Bertini, Andrew D. Bagdanov
PDF
In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation Dahyun Kang, Minsu Cho
PDF
iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning Tom Fischer, Yaoyao Liu, Artur Jesslen, Noor Ahmed, Prakhar Kaushik, Angtian Wang, Alan Yuille, Adam Kortylewski, Eddy Ilg
PDF
Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer Zhuoyi Yang, Heyang Jiang, Wenyi Hong, Jiayan Teng, Wendi Zheng, Yuxiao Dong, Ming Ding, Jie Tang
PDF
Infinite-ID: Identity-Preserved Personalization via ID-Semantics Decoupling Paradigm Yi Wu, Ziqiang Li, Heliang Zheng, Chaoyue Wang, Bin Li
PDF
InfMAE: A Foundation Model in the Infrared Modality Fangcen Liu, Chenqiang Gao, Yaming Zhang, Junjie Guo, Jinghao Wang, Deyu Meng
PDF
InfoNorm: Mutual Information Shaping of Normals for Sparse-View Reconstruction Xulong Wang, Siyan Dong, Youyi Zheng, Yanchao Yang
PDF
Information Bottleneck Based Data Correction in Continual Learning Shuai Chen, Mingyi Zhang, Junge Zhang, Kaiqi Huang
PDF
Insect Identification in the Wild: The AMI Dataset Aditya Jain, Fagner Cunha, Michael J Bunsen, Juan Sebastián Cañas, Léonard Pasi, Nathan Pinoy, Flemming Helsing, JoAnne Russo, Marc S Botham, Michael Sabourin, Jonathan Fréchette, Alexandre Anctil, Yacksecari Lopez, Eduardo Navarro, Filonila Pérez, Ana C Zamora, Jose Alejandro Ramirez-Silva, Jonathan Gagnon, Tom A August, Kim Bjerge, Alba Gomez Segura, Marc Belisle, Yves Basset, Kent P McFarland, David B Roy, Toke T Høye, Maxim Larrivee, David Rolnick
PDF
InsMapper: Exploring Inner-Instance Information for Vectorized HD Mapping Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang Zhao
PDF
Instance-Dependent Noisy-Label Learning with Graphical Model Based Noise-Rate Estimation Arpit Garg, Cuong Cao Nguyen, Rafael Felix, Thanh-Toan Do, Gustavo Carneiro
PDF
Instant 3D Human Avatar Generation Using Image Diffusion Models Nikos Kolotouros, Thiemo Alldieck, Enric Corona, Eduard Gabriel Bazavan, Cristian Sminchisescu
PDF
Instant Uncertainty Calibration of NeRFs Using a Meta-Calibrator Niki Amini-Naieni, Tomas Jakab, Andrea Vedaldi, Ronald Clark
PDF
InstaStyle: Inversion Noise of a Stylized Image Is Secretly a Style Adviser Xing Cui, Zekun Li, Peipei Li, Huaibo Huang, Xuannan Liu, Zhaofeng He
PDF
InstructGIE: Towards Generalizable Image Editing Zichong Meng, Changdi Yang, Jun Liu, Hao Tang, Pu Zhao, Yanzhi Wang
PDF
Instruction Tuning-Free Visual Token Complement for Multimodal LLMs Dongsheng Wang, Jiequan Cui, Miaoge Li, Wang Lin, Bo Chen, Hanwang Zhang
PDF
InstructIR: High-Quality Image Restoration Following Human Instructions Marcos V. Conde, Gregor Geigle, Radu Timofte
PDF
Integer-Valued Training and Spike-Driven Inference Spiking Neural Network for High-Performance and Energy-Efficient Object Detection Xinhao Luo, Man Yao, Yuhong Chou, Bo Xu, Guoqi Li
PDF
Integrating Markov Blanket Discovery into Causal Representation Learning for Domain Generalization Naiyu Yin, Hanjing Wang, Yue Yu, Tian Gao, Amit Dhurandhar, Qiang Ji
PDF
Integration of Global and Local Representations for Fine-Grained Cross-Modal Alignment Seungwan Jin, Hoyoung Choi, Taehyung Noh, Kyungsik Han
PDF
Inter-Class Topology Alignment for Efficient Black-Box Substitute Attacks Lingzhuang Meng, Mingwen Shao, Yuanjian Qiao, Wenjie Liu
PDF
Interaction-Centric Spatio-Temporal Context Reasoning for Multi-Person Video HOI Recognition Yisong Wang, Nan Xi, Jingjing Meng, Junsong Yuan
PDF
Interactive 3D Object Detection with Prompts Ruifei Zhang, Xiangru Lin, Wei Zhang, Jincheng Lu, Xuekuan Wang, Xiao Tan, Yingying Li, Errui Ding, Jingdong Wang, Guanbin Li
PDF
InterFusion: Text-Driven Generation of 3D Human-Object Interaction Sisi Dai, Wenhao Li, Haowen Sun, Haibin Huang, Chongyang Ma, Hui Huang, Kai Xu, Ruizhen Hu
PDF
Interleaving One-Class and Weakly-Supervised Models with Adaptive Thresholding for Unsupervised Video Anomaly Detection Yongwei Nie, Hao Huang, Chengjiang Long, Qing Zhang, Pradipta Maji, Hongmin Cai
PDF
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding Yi Wang, Kunchang Li, Xinhao Li, Jiashuo Yu, Yinan He, Guo Chen, Baoqi Pei, Rongkun Zheng, Jilan Xu, Zun Wang, Yansong Shi, Tianxiang Jiang, SongZe Li, Hongjie Zhang, Yifei Huang, Yu Qiao, Yali Wang, Limin Wang
PDF
Interpretability-Guided Test-Time Adversarial Defense Akshay Kulkarni, Tsui-Wei Weng
PDF
INTRA: Interaction Relationship-Aware Weakly Supervised Affordance Grounding Ji Ha Jang, Hoigi Seo, Se Young Chun
PDF
Intrinsic Single-Image HDR Reconstruction Sebastian Dille, Chris Careaga, Yagiz Aksoy
PDF
IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination Xi Chen, Sida Peng, Dongchen Yang, Yuan Liu, Bowen Pan, Chengfei Lyu, Xiaowei Zhou
PDF
Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks Tingyu Qu, Tinne Tuytelaars, Marie-Francine Moens
PDF
Invertible Neural Warp for NeRF Shin-Fang Chng, Ravi Garg, Hemanth Saratchandran, Simon Lucey
PDF
Investigating Style Similarity in Diffusion Models Gowthami Somepalli, Anubhav Gupta, Kamal Gupta, Shramay Palta, Micah Goldblum, Jonas A. Geiping, Abhinav Shrivastava, Tom Goldstein
PDF
IRGen: Generative Modeling for Image Retrieval Yidan Zhang, Ting Zhang, Dong Chen, Yujing Wang, Qi Chen, Xing Xie, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Mao Yang, Qingmin Liao, Jingdong Wang, Baining Guo
PDF
IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection Mingjin Zhang, Yuchun Wang, Jie Guo, Yunsong Li, Xinbo Gao, Jing Zhang
PDF
Is Retain Set All You Need in Machine Unlearning? Restoring Performance of Unlearned Models with Out-of-Distribution Images Jacopo Bonato, Marco Cotogni, Luigi Sabetta
PDF
Is User Feedback Always Informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation Without Source Data Junha Song, Tae Soo Kim, Junha Kim, Gunhee Nam, Thijs Kooi, Jaegul Choo
PDF
Isomorphic Pruning for Vision Models Gongfan Fang, Xinyin Ma, Michael Bi Mi, Xinchao Wang
PDF
Iterative Ensemble Training with Anti-Gradient Control for Mitigating Memorization in Diffusion Models Xiao Liu, Xiaoliu Guan, Yu Wu, Jiaxu Miao
PDF
ItTakesTwo: Leveraging Peer Representations for Semi-Supervised LiDAR Semantic Segmentation Yuyuan Liu, Yuanhong Chen, Hu Wang, Vasileios Belagiannis, Ian Reid, Gustavo Carneiro
PDF
IVTP: Instruction-Guided Visual Token Pruning for Large Vision-Language Models Kai Huang, Hao Zou, Ye Xi, Bochen Wang, Zhen Xie, Liang Yu
PDF
JDT3D: Addressing the Gaps in LiDAR-Based Tracking-by-Attention Brian Cheong, Jiachen Zhou, Steven L Waslander
PDF
Joint RGB-Spectral Decomposition Model Guided Image Enhancement in Mobile Photography Kailai Zhou, Lijing Cai, Yibo Wang, Mengya Zhang, Bihan Wen, Qiu Shen, Xun Cao
PDF
JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation ChenHan Jiang, Yihan Zeng, Tianyang Hu, Songcen Xu, Wei Zhang, Hang Xu, Dit-Yan Yeung
PDF
Just a Hint: Point-Supervised Camouflaged Object Detection Huafeng Chen, Dian Shao, Guangqian Guo, Shan Gao
PDF
Kalman-Inspired Feature Propagation for Video Face Super-Resolution Ruicheng Feng, Chongyi Li, Chen Change Loy
PDF
KDProR: A Knowledge-Decoupling Probabilistic Framework for Video-Text Retrieval Xianwei Zhuang, Hongxiang Li, Xuxin Cheng, Zhihong Zhu, Yuxin Xie, Yuexian Zou
PDF
Kernel Diffusion: An Alternate Approach to Blind Deconvolution Yash Sanghvi, Yiheng Chi, Stanley Chan
PDF
Keypoint Promptable Re-Identification Vladimir Somers, Alexandre Alahi, Christophe De Vleeschouwer
PDF
KeypointDETR: An End-to-End 3D Keypoint Detector Hairong Jin, Yuefan Shen, Jianwen Lou, Kun Zhou, Youyi Zheng
PDF
KFD-NeRF: Rethinking Dynamic NeRF with Kalman Filter Yifan Zhan, Zhuoxiao Li, Muyao Niu, Zhihang Zhong, Shohei Nobuhara, Ko Nishino, Yinqiang Zheng
PDF
Kinetic Typography Diffusion Model Seonmi Park, Inhwan Bae, Seunghyun Shin, Hae-Gon Jeon
PDF
KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding Zhihao Xu, Shengjie Gong, Jiapeng Tang, Lingyu Liang, Yining Huang, Haojie Li, Shuangping Huang
PDF
Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation Tao Chen, Xiruo Jiang, Gensheng Pei, Zeren Sun, Yucheng Wang, Yazhou Yao
PDF
Knowledge-Enhanced Visual-Language Pretraining for Computational Pathology Xiao Zhou, Xiaoman Zhang, Chaoyi Wu, Ya Zhang, Weidi Xie, Yan-Feng Wang
PDF
L-DiffER: Single Image Reflection Removal with Language-Based Diffusion Model Yuchen Hong, Haofeng Zhong, Shuchen Weng, Jinxiu S Liang, Boxin Shi
PDF
Label-Anticipated Event Disentanglement for Audio-Visual Video Parsing Jinxing Zhou, Dan Guo, Yuxin Mao, Yiran Zhong, Xiaojun Chang, Meng Wang
PDF
Label-Free Neural Semantic Image Synthesis Jiayi Wang, Kevin A Laube, Yumeng Li, Jan Hendrik Metzen, Shin-I Cheng, Julio Borges, Anna Khoreva
PDF
LabelDistill: Label-Guided Cross-Modal Knowledge Distillation for Camera-Based 3D Object Detection Sanmin Kim, Youngseok Kim, Sihwan Hwang, Hyeonjun Jeong, Dongsuk Kum
PDF
Labeled Data Selection for Category Discovery Bingchen Zhao, Nico Lang, Serge Belongie, Oisin Mac Aodha
PDF
Lagrangian Hashing for Compressed Neural Field Representations Shrisudhan Govindarajan, Zeno Sambugaro, Akhmedkhan Shabanov, Towaki Takikawa, Weiwei Sun, Daniel Rebain, Nicola Conci, Kwang Moo Yi, Andrea Tagliasacchi
PDF
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction Penghui Du, Yu Wang, Yifan Sun, Luting Wang, Yue Liao, Gang Zhang, Errui Ding, Yan Wang, Jingdong Wang, Si Liu
PDF
Lane Graph as Path: Continuity-Preserving Path-Wise Modeling for Online Lane Graph Construction Bencheng Liao, Shaoyu Chen, Bo Jiang, Tianheng Cheng, Qian Zhang, Wenyu Liu, Chang Huang, Xinggang Wang
PDF
Language-Assisted Skeleton Action Understanding for Skeleton-Based Temporal Action Segmentation Haoyu Ji, Bowen Chen, Xinglong Xu, Weihong Ren, Zhiyong Wang, Honghai Liu
PDF
Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance Toan Nguyen, Minh Nhat Vu, Baoru Huang, An Dinh Vuong, Quan Vuong, Ngan Le, Thieu Vo, Anh Nguyen
PDF
Language-Driven Physics-Based Scene Synthesis and Editing via Feature Splatting Ri-Zhao Qiu, Ge Yang, Weijia Zeng, Xiaolong Wang
PDF
Language-Image Pre-Training with Long Captions Kecheng Zheng, Yifei Zhang, Wei Wu, Fan Lu, Shuailei Ma, Xin Jin, Wei Chen, Yujun Shen
PDF
LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation Ruida Zhang, Ziqin Huang, Gu Wang, Chenyangguang Zhang, Yan Di, Xingxing Zuo, Jiwen Tang, Xiangyang Ji
PDF
LAPT: Label-Driven Automated Prompt Tuning for OOD Detection with Vision-Language Models Yabin Zhang, Wenjie Zhu, Chenhang He, Lei Zhang
PDF
LaRa: Efficient Large-Baseline Radiance Fields Anpei Chen, Haofei Xu, Stefano Esposito, Siyu Tang, Andreas Geiger
PDF
Large Motion Model for Unified Multi-Modal Motion Generation Mingyuan Zhang, Daisheng Jin, Chenyang Gu, Fangzhou Hong, Zhongang Cai, Jingfang Huang, Chongzhi Zhang, Xinying Guo, Lei Yang, Ying He, Ziwei Liu
PDF
Large-Scale Multi-Hypotheses Cell Tracking Using Ultrametric Contours Maps Jordão Bragantini, Merlin Lange, Loïc A Royer
PDF
Large-Scale Reinforcement Learning for Diffusion Models Yinan Zhang, Eric Tzeng, Yilun Du, Dmitry Kislyuk
PDF
LASS3D: Language-Assisted Semi-Supervised 3D Semantic Segmentation with Progressive Unreliable Data Exploitation Jianan Li, Qiulei Dong
PDF
Latent Diffusion Prior Enhanced Deep Unfolding for Snapshot Spectral Compressive Imaging Zongliang Wu, Ruiying Lu, Ying Fu, Xin Yuan
PDF
Latent Guard: A Safety Framework for Text-to-Image Generation Runtao Liu, Ashkan Khakzar, Jindong Gu, Qifeng Chen, Philip Torr, Fabio Pizzati
PDF
Latent-INR: A Flexible Framework for Implicit Representations of Videos with Discriminative Semantics Shishira R Maiya, Anubhav Gupta, Matthew A Gwilliam, Max Ehrlich, Abhinav Shrivastava
PDF
LatentEditor: Text Driven Local Editing of 3D Scenes Umar Khalid, Hasan Iqbal, Muhammad Tayyab, Md Nazmul Karim, Jing Hua, Chen Chen
PDF
latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction Christopher Wewer, Kevin Raj, Eddy Ilg, Bernt Schiele, Jan E. Lenssen
PDF
LATTE3D: Large-Scale Amortized Text-to-Enhanced3D Synthesis Kevin Xie, Tianshi Cao, Jonathan P Lorraine, Jun Gao, James R Lucas, Antonio Torralba, Sanja Fidler, Xiaohui Zeng
PDF
LaWa: Using Latent Space for In-Generation Image Watermarking Ahmad Rezaei, Mohammad Akbari, Saeed Ranjbar Alvar, Arezou Fatemi, Yong Zhang
PDF
Layer-Wise Relevance Propagation with Conservation Property for ResNet Seitaro Otsuki, Tsumugi Iida, Félix Doublet, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi, Komei Sugiura
PDF
LayerDiff: Exploring Text-Guided Multi-Layered Composable Image Synthesis via Layer-Collaborative Diffusion Model Runhui Huang, Kaixin Cai, Jianhua Han, Xiaodan Liang, Renjing Pei, Guansong Lu, Songcen Xu, Wei Zhang, Hang Xu
PDF
Layered Rendering Diffusion Model for Controllable Zero-Shot Image Synthesis Zipeng Qi, Guoxi Huang, Chenyang Liu, Fei Ye
PDF
LayeredFlow: A Real-World Benchmark for Non-Lambertian Multi-Layer Optical Flow Hongyu Wen, Erich Liang, Jia Deng
PDF
Layout-Corrector: Alleviating Layout Sticking Phenomenon in Discrete Diffusion Model Shoma Iwai, Atsuki Osanai, Shunsuke Kitada, Shinichiro Omachi
PDF
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer Ning Yu, Chia-chih Chen, Zeyuan Chen, Rui Meng, Gang Wu, Paul W Josel, Juan Carlos Niebles, Caiming Xiong, Ran Xu
PDF
LayoutFlow: Flow Matching for Layout Generation Julian Jorge Andrade Guerreiro, Naoto Inoue, Kento Masui, Mayu Otani, Hideki Nakayama
PDF
Lazy Diffusion Transformer for Interactive Image Editing Yotam Nitzan, Zongze Wu, Richard Zhang, Eli Shechtman, Danny Cohen-Or, Taesung Park, Michaël Gharbi
PDF
LCM-Lookahead for Encoder-Based Text-to-Image Personalization Rinon Gal, Or Lichter, Elad Richardson, Or Patashnik, Amit Bermano, Gal Chechik, Danny Cohen-Or
PDF
Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence Mengyao Lyu, Tianxiang Hao, Xinhao Xu, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding
PDF
Learn to Memorize and to Forget: A Continual Learning Perspective of Dynamic SLAM Baicheng Li, Zike Yan, Dong Wu, Hanqing Jiang, Hongbin Zha
PDF
Learn to Optimize Denoising Scores: A Unified and Improved Diffusion Prior for 3D Generation Xiaofeng Yang, Yiwen Chen, Cheng Chen, Chi Zhang, Yi Xu, Xulei Yang, Fayao Liu, Guosheng Lin
PDF
Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization Jiajun Hu, Jian Zhang, Lei Qi, Yinghuan Shi, Yang Gao
PDF
Learned HDR Image Compression for Perceptually Optimal Storage and Display Peibei Cao, Haoyu Chen, Jingzhe Ma, Yu-Chieh Yuan, Zhiyong Xie, Xin Xie, Haiqing Bai, Kede Ma
PDF
Learned Image Enhancement via Color Naming David Serrano-Lozano, Luis Herranz, Michael S Brown, Javier Vazquez-Corral
PDF
Learned Neural Physics Simulation for Articulated 3D Human Pose Reconstruction Misha Andriluka, Baruch Tabanpour, Daniel Freeman, Cristian Sminchisescu
PDF
Learned Rate Control for Frame-Level Adaptive Neural Video Compression via Dynamic Neural Network Chenhao Zhang, Wei Gao
PDF
Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal Yuxin Wang, Qianyi Wu, Guofeng Zhang, Dan Xu
PDF
Learning 3D-Aware GANs from Unposed Images with Template Feature Field Xinya Chen, Hanlei Guo, Yanrui Bin, Shangzhan Zhang, Yuanbo Yang, Yujun Shen, Yue Wang, Yiyi Liao
PDF
Learning a Dynamic Privacy-Preserving Camera Robust to Inversion Attacks Jiacheng Cheng, Xiang Dai, Jia Wan, Nick Antipa, Nuno Vasconcelos
PDF
Learning Anomalies with Normality Prior for Unsupervised Video Anomaly Detection Haoyue Shi, Le Wang, Sanping Zhou, Gang Hua, Wei Tang
PDF
Learning by Aligning 2D Skeleton Sequences and Multi-Modality Fusion Quoc-Huy Tran, Muhammad Ahmed, Murad Popattia, Muhammad Hassan Ahmed, Andrey Konin, Zeeshan Zia
PDF
Learning Camouflaged Object Detection from Noisy Pseudo Label Jin Zhang, Ruiheng Zhang, Yanjiao Shi, Zhe Cao, Nian Liu, Fahad Shahbaz Khan
PDF
Learning Chain of Counterfactual Thought for Bias-Robust Vision-Language Reasoning Yifeng Zhang, Ming Jiang, Qi Zhao
PDF
Learning Cross-Hand Policies of High-DOF Reaching and Grasping Qijin She, Shishun Zhang, Yunfan Ye, Ruizhen Hu, Kai Xu
PDF
Learning Differentially Private Diffusion Models via Stochastic Adversarial Distillation Bochao Liu, Pengju Wang, Shiming Ge
PDF
Learning Diffusion Models for Multi-View Anomaly Detection Chieh Liu, Yu-Min Chu, Ting-I Hsieh, Hwann-Tzong Chen, Tyng-Luh Liu
PDF
Learning Dual-Level Deformable Implicit Representation for Real-World Scale Arbitrary Super-Resolution Zhiheng Li, Muheng Li, Jixuan Fan, Lei Chen, Yansong Tang, Jiwen Lu, Jie Zhou
PDF
Learning Equilibrium Transformation for Gamut Expansion and Color Restoration Jun Xiao, Changjian Shui, Zhi-Song Liu, Qian Ye, Kin-Man Lam
PDF
Learning Exhaustive Correlation for Spectral Super-Resolution: Where Spatial-Spectral Attention Meets Linear Dependence Hongyuan Wang, Lizhi Wang, Jiang Xu, Chang Chen, Xue Hu, Fenglong Song, Youliang Yan
PDF
Learning from the Web: Language Drives Weakly-Supervised Incremental Learning for Semantic Segmentation Chang Liu, Giulia Rizzoli, Pietro Zanuttigh, Fu Li, Yi Niu
PDF
Learning High-Resolution Vector Representation from Multi-Camera Images for 3D Object Detection Zhili Chen, Shuangjie Xu, Maosheng Ye, Zian Qian, Xiaoyi Zou, Dit-Yan Yeung, Qifeng Chen
PDF
Learning Local Pattern Modularization for Point Cloud Reconstruction from Unseen Classes Chao Chen, Yu-Shen Liu, Zhizhong Han
PDF
Learning Modality-Agnostic Representation for Semantic Segmentation from Any Modalities Xu Zheng, Yuanhuiyi Lyu, Lin Wang
PDF
Learning Multimodal Latent Generative Models with Energy-Based Prior Shiyu Yuan, Jiali Cui, Hanao Li, Tian Han
PDF
Learning Natural Consistency Representation for Face Forgery Video Detection Daichi Zhang, Zihao Xiao, Shikun Li, Fanzhao Lin, Jianmin Li, Shiming Ge
PDF
Learning Neural Deformation Representation for 4D Dynamic Shape Generation Gyojin Han, Jiwan Hur, Jaehyun Choi, Junmo Kim
PDF
Learning Neural Volumetric Pose Features for Camera Localization Jingyu Lin, Jiaqi Gu, Bojian Wu, Lubin Fan, Renjie Chen, Ligang Liu, Jieping Ye
PDF
Learning Non-Linear Invariants for Unsupervised Out-of-Distribution Detection Lars Doorenbos, Raphael Sznitman, Pablo Márquez Neila
PDF
Learning Pseudo 3D Guidance for View-Consistent Texturing with 2D Diffusion Kehan Li, Yanbo Fan, Yang Wu, Zhongqian Sun, Wei Yang, Xiangyang Ji, Li Yuan, Jie Chen
PDF
Learning Quantized Adaptive Conditions for Diffusion Models Yuchen Liang, Yuchuan Tian, Lei Yu, Huaao Tang, Jie Hu, Xiangzhong Fang, Hanting Chen
PDF
Learning Representation for Multitask Learning Through Self-Supervised Auxiliary Learning Seokwon Shin, Hyungrok Do, Youngdoo Son
PDF
Learning Representations from Foundation Models for Domain Generalized Stereo Matching Yongjian Zhang, Longguang Wang, Kunhong Li, Wang Yun, Yulan Guo
PDF
Learning Representations of Satellite Images from Metadata Supervision Jules Bourcier, Gohar Dashyan, Karteek Alahari, Jocelyn Chanussot
PDF
Learning Scalable Model Soup on a Single GPU: An Efficient Subspace Training Strategy Tao Li, Weisen Jiang, Fanghui Liu, Xiaolin Huang, James Kwok
PDF
Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction Guowei Xu, Jiale Tao, Wen Li, Lixin Duan
PDF
Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning Jihai Zhang, Xiang Lan, Xiaoye Qu, Yu Cheng, Mengling Feng, Bryan Hooi
PDF
Learning to Adapt SAM for Segmenting Cross-Domain Point Clouds Xidong Peng, Runnan Chen, Feng Qiao, Lingdong Kong, Youquan Liu, Yujing Sun, Tai Wang, Xinge Zhu, Yuexin Ma
PDF
Learning to Build by Building Your Own Instructions Aaron T Walsman, Muru Zhang, Adam Fishman, Ali Farhadi, Dieter Fox
PDF
Learning to Complement and to Defer to Multiple Users Zheng Zhang, Wenjie Ai, Kevin Wells, David M Rosewarne, Thanh-Toan Do, Gustavo Carneiro
PDF
Learning to Detect Multi-Class Anomalies with Just One Normal Image Prompt Bin-Bin Gao
PDF
Learning to Distinguish Samples for Generalized Category Discovery Fengxiang Yang, Nan Pu, Wenjing Li, Zhiming Luo, Shaozi Li, Nicu Sebe, Zhun Zhong
PDF
Learning to Drive via Asymmetric Self-Play Chris Zhang, Sourav Biswas, Kelvin Wong, Kion Fallah, Lunjun Zhang, Dian Chen, Sergio Casas, Raquel Urtasun
PDF
Learning to Enhance Aperture Phasor Field for Non-Line-of-Sight Imaging In Cho, Hyunbo Shim, Seon Joo Kim
PDF
Learning to Generate Conditional Tri-Plane for 3D-Aware Expression Controllable Portrait Animation Taekyung Ki, Dongchan Min, Gyeongsu Chae
PDF
Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment Yuxiao Chen, Kai Li, Wentao Bao, Deep Patel, Yu Kong, Martin Renqiang Min, Dimitris N. Metaxas
PDF
Learning to Make Keypoints Sub-Pixel Accurate Shinjeong Kim, Marc Pollefeys, Daniel Barath
PDF
Learning to Obstruct Few-Shot Image Classification over Restricted Classes Amber Yijia Zheng, Chiao-An Yang, Raymond A. Yeh
PDF
Learning to Robustly Reconstruct Dynamic Scenes from Low-Light Spike Streams Liwen Hu, Ziluo Ding, Mianzhi Liu, Lei Ma, Tiejun Huang
PDF
Learning to Unlearn for Robust Machine Unlearning Mark He Huang, Lin Geng Foo, Jun Liu
PDF
Learning Trimodal Relation for Audio-Visual Question Answering with Missing Modality Kyu Ri Park, Hong Joo Lee, Jung Uk Kim
PDF
Learning Unified Reference Representation for Unsupervised Multi-Class Anomaly Detection Liren He, Zhengkai Jiang, Jinlong Peng, Wenbing Zhu, Liang Liu, Qiangang Du, Xiaobin Hu, Mingmin Chi, Yabiao Wang, Chengjie Wang
PDF
Learning Unsigned Distance Functions from Multi-View Images with Volume Rendering Priors Wenyuan Zhang, Kanle Shi, Yu-Shen Liu, Zhizhong Han
PDF
Learning Video Context as Interleaved Multimodal Sequences Kevin Qinghong Lin, Pengchuan Zhang, Difei Gao, Xide Xia, Joya Chen, Ziteng Gao, Jinheng Xie, Xuhong Xiao, Mike Zheng Shou
PDF
Learning Where to Look: Self-Supervised Viewpoint Selection for Active Localization Using Geometrical Information Luca Di Giammarino, Boyang Sun, Giorgio Grisetti, Marc Pollefeys, Hermann Blum, Daniel Barath
PDF
Learning with Counterfactual Explanations for Radiology Report Generation Mingjie Li, Haokun Lin, Liang Qiu, Xiaodan Liang, Ling Chen, Abdulmotaleb Elsaddik, Xiaojun Chang
PDF
Learning with Unmasked Tokens Drives Stronger Vision Learners Taekyung Kim, Sanghyuk Chun, Byeongho Heo, Dongyoon Han
PDF
Learning-Based Axial Video Motion Magnification Kwon Byung-Ki, Oh Hyun-Bin, Kim Jun-Seong, Hyunwoo Ha, Tae-Hyun Oh
PDF
LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning Bolin Lai, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M Rehg, Miao Liu
PDF
Lego: Learning to Disentangle and Invert Personalized Concepts Beyond Object Appearance in Text-to-Image Diffusion Models Saman Motamed, Danda Pani Paudel, Luc Van Gool
PDF
LEIA: Latent View-Invariant Embeddings for Implicit 3D Articulation Archana Swaminathan, Anubhav Gupta, Kamal Gupta, Shishira R Maiya, Vatsal Agarwal, Abhinav Shrivastava
PDF
Length-Aware Motion Synthesis via Latent Diffusion Alessio Sampieri, Alessio Palma, Indro Spinelli, Fabio Galasso
PDF
LEROjD: LiDAR Extended Radar-Only Object Detection Patrick Palmer, Martin Krüger, Stefan Schütte, Richard Altendorfer, Ganesh Adam, Torsten Bertram
PDF
Let the Avatar Talk Using Texts Without Paired Training Data Xiuzhe Wu, Yang-Tian Sun, Handi Chen, Hang Zhou, Jingdong Wang, Zhengzhe Liu, Xiaojuan Qi
PDF
LetsMap: Unsupervised Representation Learning for Label-Efficient Semantic BEV Mapping Nikhil Gosala, Kürsat Petek, B Ravi Kiran, Senthil Yogamani, Paulo L. J. Drews-Jr, Wolfram Burgard, Abhinav Valada
PDF
Leveraging Enhanced Queries of Point Sets for Vectorized Map Construction Zihao Liu, Xiaoyu Zhang, Guangwei Liu, Ji Zhao, Ningyi Xu
PDF
Leveraging Hierarchical Feature Sharing for Efficient Dataset Condensation Haizhong Zheng, Jiachen Sun, Shutong Wu, Bhavya Kailkhura, Zhuoqing Morley Mao, Chaowei Xiao, Atul Prakash
PDF
Leveraging Imperfect Restoration for Data Availability Attack Yi Huang, Jeremy Styborski, Mingzhi Lyu, Fan Wang, Wai-Kin Adams Kong
PDF
Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos Akshay Paruchuri, Samuel Ehrenstein, Shuxian Wang, Inbar Fried, Stephen Pizer, Marc Niethammer, Roni Sengupta
PDF
Leveraging Representations from Intermediate Encoder-Blocks for Synthetic Image Detection Christos Koutlis, Symeon Papadopoulos
PDF
Leveraging Scale- and Orientation-Covariant Features for Planar Motion Estimation Marcus Valtonen Örnhag, Alberto Jaenal
PDF
Leveraging Temporal Contextualization for Video Action Recognition Minji Kim, Dongyoon Han, Taekyung Kim, Bohyung Han
PDF
Leveraging Text Localization for Scene Text Removal via Text-Aware Masked Image Modeling Zixiao Wang, Hongtao Xie, YuXin Wang, Yadong Qu, Fengjun Guo, Pengwei Liu
PDF
Leveraging Thermal Modality to Enhance Reconstruction in Low-Light Conditions Jiacong Xu, Mingqian Liao, Ram Prabhakar Kathirvel, Vishal Patel
PDF
LG-Gaze: Learning Geometry-Aware Continuous Prompts for Language-Guided Gaze Estimation Pengwei Yin, Jingjing Wang, Guanzhong Zeng, Di Xie, Jiang Zhu
PDF
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, Ziwei Liu
PDF
LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model Dilxat Muhtar, Zhenshi Li, Feng Gu, Xueliang Zhang, Pengfeng Xiao
PDF
LiDAR-Based All-Weather 3D Object Detection via Prompting and Distilling 4D Radar Yujeong Chae, Hyeonseong Kim, Changgyoon Oh, Minseok Kim, Kuk-Jin Yoon
PDF
LiDAR-Event Stereo Fusion with Hallucinations Luca Bartolomei, Matteo Poggi, Andrea Conti, Stefano Mattoccia
PDF
LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors Saksham Suri, Matthew Walmer, Kamal Gupta, Abhinav Shrivastava
PDF
Light-in-Flight for a World-in-Motion Jongho Lee, Ryan J Suess, Mohit Gupta
PDF
LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models Hai Jiang, Ao Luo, Xiaohong Liu, Songchen Han, Shuaicheng Liu
PDF
Linearly Controllable GAN: Unsupervised Feature Categorization and Decomposition for Image Generation and Manipulation Sehyung Lee, Mijung Kim, Yeongnam Chae, Bjorn Stenger
PDF
LineFit: A Geometric Approach for Fitting Line Segments in Images Marion Boyer, David Youssefi, Florent Lafarge
PDF
LingoQA: Video Question Answering for Autonomous Driving Ana-Maria Marcu, Long Chen, Jan Hünermann, Alice Karnsund, Benoit Hanotte, Prajwal Chidananda, Saurabh Nair, Vijay Badrinarayanan, Alex Kendall, Jamie Shotton, Elahe Arani, Oleg Sinavski
PDF
Linking in Style: Understanding Learned Features in Deep Learning Models Maren Wehrheim, Pamela Osuna Vargas, Matthias Kaschube
PDF
LISO: LiDAR-Only Self-Supervised 3D Object Detection Stefan Andreas Baur, Frank Moosmann, Andreas Geiger
PDF
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation Bolin Lai, Fiona Ryan, Wenqi Jia, Miao Liu, James M Rehg
PDF
LITA: Language Instructed Temporal-Localization Assistant De-An Huang, Shijia Liao, Subhashree Radhakrishnan, Hongxu Yin, Pavlo Molchanov, Zhiding Yu, Jan Kautz
PDF
LiteSAM Is Actually What You Need for Segment Everything Jianhai Fu, Yuanjie Yu, Ningchuan Li, Yi Zhang, Qichao Chen, Jianping Xiong, Jun Yin, Zhiyu Xiang
PDF
LiveHPS++: Robust and Coherent Motion Capture in Dynamic Free Environment Yiming Ren, Xiao Han, Yichen Yao, Xiaoxiao Long, Yujing Sun, Yuexin Ma
PDF
LivePhoto: Real Image Animation with Text-Guided Motion Control Xi Chen, Zhiheng Liu, Mengting Chen, Yutong Feng, Yu Liu, Yujun Shen, Hengshuang Zhao
PDF
LLaMA-VID: An Image Is Worth 2 Tokens in Large Language Models Yanwei Li, Chengyao Wang, Jiaya Jia
PDF
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models Hao Zhang, Hongyang Li, Feng Li, Tianhe Ren, Xueyan Zou, Shilong Liu, Shijia Huang, Jianfeng Gao, Lei Zhang, Chunyuan Li, Jianwei Yang
PDF
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li
PDF
LLaVA-UHD: An LMM Perceiving Any Aspect Ratio and High-Resolution Images Zonghao Guo, Ruyi Xu, Yuan Yao, Junbo Cui, Zanlin Ni, Chunjiang Ge, Tat-Seng Chua, Zhiyuan Liu, Gao Huang
PDF
LLM as Copilot for Coarse-Grained Vision-and-Language Navigation Yanyuan Qiao, Qianyi Liu, Jiajun Liu, Jing Liu, Qi Wu
PDF
LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model Yulin Luo, Ruichuan An, Bocheng Zou, Yiming Tang, Jiaming Liu, Shanghang Zhang
PDF
LLMCO4MR: LLMs-Aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang Yuqing Zhang, Hangqi Li, Shengyu Zhang, Runzhong Wang, Baoyi He, Huaiyong Dou, Junchi Yan, Yongquan Zhang, Fei Wu
PDF
LLMGA: Multimodal Large Language Model Based Generation Assistant Bin Xia, Shiyin Wang, Yingfan Tao, Yitong Wang, Jiaya Jia
PDF
LMT-GP: Combined Latent Mean-Teacher and Gaussian Process for Semi-Supervised Low-Light Image Enhancement Ye Yu, Fengxin Chen, Jun Yu, Zhen Kan
PDF
LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation Yushi Lan, Fangzhou Hong, Shuai Yang, Shangchen Zhou, Xuyi Meng, Bo Dai, Xingang Pan, Chen Change Loy
PDF
LNL+K: Enhancing Learning with Noisy Labels Through Noise Source Knowledge Integration Siqi Wang, Bryan Plummer
PDF
LoA-Trans: Enhancing Visual Grounding by Location-Aware Transformers Ziling Huang, Shin'ichi Satoh
PDF
Loc3Diff: Local Diffusion for 3D Human Head Synthesis and Editing Yushi Lan, Feitong Tan, Qiangeng Xu, Di Qiu, Kyle Genova, Zeng Huang, Rohit Pandey, Sean Fanello, Thomas Funkhouser, Chen Change Loy, Yinda Zhang
PDF
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Runyi Yu, Chang Liu, Xiangyang Ji, Li Yuan, Jie Chen
PDF
Local All-Pair Correspondence for Point Tracking Seokju Cho, Jiahui Huang, Jisu Nam, Honggyu An, Seungryong Kim, Joon-Young Lee
PDF
Local and Global Flatness for Federated Domain Generalization Hao Yan, Yuhong Guo
PDF
Local Occupancy-Enhanced Object Grasping with Multiple Triplanar Projection Kangqi Ma, Hao Dong, Yadong Mu
PDF
Localization and Expansion: A Decoupled Framework for Point Cloud Few-Shot Semantic Segmentation Zhaoyang Li, Yuan Wang, Wangkai Li, Rui Sun, Tianzhu Zhang
PDF
LogoSticker: Inserting Logos into Diffusion Models for Customized Generation Mingkang Zhu, Xi Chen, Zhongdao Wang, Hengshuang Zhao, Jiaya Jia
PDF
Long-CLIP: Unlocking the Long-Text Capability of CLIP Beichen Zhang, Pan Zhang, Xiaoyi Dong, Yuhang Zang, Jiaqi Wang
PDF
Long-Range Turbulence Mitigation: A Large-Scale Dataset and a Coarse-to-Fine Framework Shengqi Xu, Run Sun, Yi Chang, Shuning Cao, Xueyao Xiao, Luxin Yan
PDF
Long-Tail Temporal Action Segmentation with Group-Wise Temporal Logit Adjustment Zhanzhong Pang, Fadime Sener, Shrinivas Ramasubramanian, Angela Yao
PDF
Long-Term Temporal Context Gathering for Neural Video Compression Linfeng Qi, Zhaoyang Jia, Jiahao Li, Bin Li, Houqiang Li, Yan Lu
PDF
LongVLM: Efficient Long Video Understanding via Large Language Models Yuetian Weng, Mingfei Han, Haoyu He, Xiaojun Chang, Bohan Zhuang
PDF
Look Around and Learn: Self-Training Object Detection by Exploration Gianluca Scarpellini, Stefano Rosa, Pietro Morerio, Lorenzo Natale, Alessio Del Bue
PDF
Look Hear: Gaze Prediction for Speech-Directed Human Attention Sounak Mondal, Seoyoung Ahn, Zhibo Yang, Niranjan Balasubramanian, Dimitris Samaras, Gregory Zelinsky, Minh Hoai
PDF
LookupViT: Compressing Visual Information to a Limited Number of Tokens Rajat Koner, Gagan Jain, Sujoy Paul, Volker Tresp, Prateek Jain
PDF
Lossy Image Compression with Foundation Diffusion Models Lucas Relic, Roberto Azevedo, Markus Gross, Christopher Schroers
PDF
Lost and Found: Overcoming Detector Failures in Online Multi-Object Tracking Lorenzo Vaquero, Yihong Xu, Xavier Alameda-Pineda, Victor M. Brea, Manuel Mucientes
PDF
Lost in Translation: Latent Concept Misalignment in Text-to-Image Diffusion Models Juntu Zhao, Junyu Deng, Yixin Ye, Chongxuan Li, Zhijie Deng, Dequan Wang
PDF
Lost in Translation: Modern Neural Networks Still Struggle with Small Realistic Image Transformations Ofir Shifman, Yair Weiss
PDF
LPViT: Low-Power Semi-Structured Pruning for Vision Transformers Kaixin Xu, Zhe Wang, Chunyun Chen, Xue Geng, Jie Lin, Xulei Yang, Min Wu, Xiaoli Li, Weisi Lin
PDF
LRSLAM: Low-Rank Representation of Signed Distance Fields in Dense Visual SLAM System Hongbeen Park, Minjeong Park, Giljoo Nam, Jinkyu Kim
PDF
m&m’s: A Benchmark to Evaluate Tool-Use for Multi-Step Multi-Modal Tasks Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna
PDF
M^2Depth: Self-Supervised Two-Frame Multi-Camera Metric Depth Estimation Yingshuang Zou, Yikang Ding, Xi Qiu, Haoqian Wang, Haotian Zhang
PDF
M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models Seunggeun Chi, Hyung-gun Chi, Hengbo Ma, Nakul Agarwal, Faizan Siddiqui, Karthik Ramani, Kwonjoon Lee
PDF
M3DBench: Towards Omni 3D Assistant with Interleaved Multi-Modal Instructions Mingsheng Li, Xin Chen, Chi Zhang, Sijin Chen, Hongyuan Zhu, Fukun Yin, Zhuoyuan Li, Gang Yu, Tao Chen
PDF
MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion Lehong Wu, Lilang Lin, Jiahang Zhang, Yiyang Ma, Jiaying Liu
PDF
MAD-DR: Map Compression for Visual Localization with Matchness Aware Descriptor Dimension Reduction Qiang Wang
PDF
Made to Order: Discovering Monotonic Temporal Changes via Self-Supervised Video Ordering Charig Yang, Weidi Xie, Andrew Zisserman
PDF
MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing Haoyu Zhao, Tianyi Lu, Jiaxi Gu, Xing Zhang, Qingping Zheng, Zuxuan Wu, Hang Xu, Yu-Gang Jiang
PDF
MagicEraser: Erasing Any Objects via Semantics-Aware Control Fan Li, Zixiao Zhang, Yi Huang, Jianzhuang Liu, Renjing Pei, Bin Shao, Songcen Xu
PDF
MagicMirror: Fast and High-Quality Avatar Generation with Constrained Search Space Armand Comas, Di Qiu, Menglei Chai, Marcel C. Bühler, Amit Raj, Ruiqi Gao, Qiangeng Xu, Mark J Matthews, Paulo Gotardo, Sergio Orts-Escolano, Thabo Beeler
PDF
MagMax: Leveraging Model Merging for Seamless Continual Learning Daniel Marczak, Bartlomiej Twardowski, Tomasz Trzcinski, Sebastian Cygert
PDF
MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment Kanglei Zhou, Liyuan Wang, Xingxing Zhang, Hubert P. H. Shum, Frederick W. B. Li, Jianguo Li, Xiaohui Liang
PDF
Mahalanobis Distance-Based Multi-View Optimal Transport for Multi-View Crowd Localization Qi Zhang, Kaiyi Zhang, Antoni B. Chan, Hui Huang
PDF
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation Lanqing Guo, Yingqing He, Haoxin Chen, Menghan Xia, Xiaodong Cun, Yufei Wang, Siyu Huang, Yong Zhang, Xintao Wang, Qifeng Chen, Ying Shan, Bihan Wen
PDF
Make a Strong Teacher with Label Assistance: A Novel Knowledge Distillation Approach for Semantic Segmentation Shoumeng Qiu, Jie Chen, Xinrun Li, Ru Wan, Xiangyang Xue, Jian Pu
PDF
Make Your ViT-Based Multi-View 3D Detectors Faster via Token Compression Dingyuan Zhang, Dingkang Liang, Zichang Tan, Xiaoqing Ye, Cheng Zhang, Jingdong Wang, Xiang Bai
PDF
Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation Fangfu Liu, Hanyang Wang, Weiliang Chen, Haowen Sun, Yueqi Duan
PDF
Making Large Language Models Better Planners with Reasoning-Decision Alignment Zhijian Huang, Tao Tang, Shaoxiang Chen, Sihao Lin, Zequn Jie, Lin Ma, Guangrun Wang, Xiaodan Liang
PDF
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data Shufan Li, Aditya Grover, Harkanwar Singh
PDF
MambaIR: A Simple Baseline for Image Restoration with State-Space Model Hang Guo, Jinmin Li, Tao Dai, Zhihao Ouyang, Xudong Ren, Shu-Tao Xia
PDF
ManiGaussian: Dynamic Gaussian Splatting for Multi-Task Robotic Manipulation Guanxing Lu, Shiyi Zhang, Ziwei Wang, Changliu Liu, Jiwen Lu, Yansong Tang
PDF
MANIKIN: Biomechanically Accurate Neural Inverse Kinematics for Human Motion Estimation Jiaxi Jiang, Paul Streli, Xuejing Luo, Christoph Gebhardt, Christian Holz
PDF
MAP-ADAPT: Real-Time Quality-Adaptive Semantic 3D Maps Jianhao Zheng, Daniel Barath, Marc Pollefeys, Iro Armeni
PDF
MapDistill: Boosting Efficient Camera-Based HD Map Construction via Camera-LiDAR Fusion Model Distillation Xiaoshuai Hao, Ruikai Li, Hui Zhang, Rong Yin, Dingzhe Li, Sangil Jung, Seung-In Park, ByungIn Yoo, Haimei Zhao, Jing Zhang
PDF
MapTracker: Tracking with Strided Memory Fusion for Consistent Vector HD Mapping Jiacheng Chen, Yuefan Wu, Jiaqi Tan, Hang Ma, Yasutaka Furukawa
PDF
MarineInst: A Foundation Model for Marine Image Analysis with Instance Visual Description Ziqiang Zheng, Yiwei Chen, Huimin Zeng, Tuan-Anh Vu, Binh-Son Hua, Sai-Kit Yeung
PDF
MaRINeR: Enhancing Novel Views by Matching Rendered Images with Nearby References Lukas Bösiger, Mihai Dusmanu, Marc Pollefeys, Zuria Bauer
PDF
Markov Knowledge Distillation: Make Nasty Teachers Trained by Self-Undermining Knowledge Distillation Fully Distillable En-hui Yang, Linfeng Ye
PDF
MARs: Multi-View Attention Regularizations for Patch-Based Feature Recognition of Space Terrain Timothy Chase Jr, Karthik Dantu
PDF
MART: MultiscAle Relational Transformer Networks for Multi-Agent Trajectory Prediction Seongju Lee, Junseok Lee, Yeonguk Yu, Taeri Kim, Kyoobin Lee
PDF
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection Kuo Wang, Lechao Cheng, Weikai Chen, Pingping Zhang, Liang Lin, Fan Zhou, Guanbin Li
PDF
Mask as Supervision: Leveraging Unified Mask Information for Unsupervised 3D Pose Estimation Yuchen Yang, Yu Qiao, Xiao Sun
PDF
Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks Sehwan Choi, Jun Won Choi, Jungho Kim, Hongjae Shin
PDF
Masked Angle-Aware Autoencoder for Remote Sensing Images Zhihao Li, Biao Hou, Siteng Ma, Zitong Wu, Xianpeng Guo, Bo Ren, Licheng Jiao
PDF
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity Santiago Pascual, Chunghsin Yeh, Ioannis Tsiamas, Joan Serrà
PDF
Masked Motion Prediction with Semantic Contrast for Point Cloud Sequence Learning Yuehui Han, Can Xu, Rui Xu, Jianjun Qian, Jin Xie
PDF
Masked Video and Body-Worn IMU Autoencoder for Egocentric Action Recognition Mingfang Zhang, Yifei Huang, Ruicong Liu, Yoichi Sato
PDF
MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation Yuxiang Wei, Zhilong Ji, Jinfeng Bai, Hongzhi Zhang, Lei Zhang, Wangmeng Zuo
PDF
Match-Stereo-Videos: Bidirectional Alignment for Consistent Dynamic Stereo Matching Junpeng Jing, Ye Mao, Krystian Mikolajczyk
PDF
MathVerse: Does Your Multi-Modal LLM Truly See the Diagrams in Visual Math Problems? Renrui Zhang, Dongzhi Jiang, Yichi Zhang, Haokun Lin, Ziyu Guo, Pengshuo Qiu, Aojun Zhou, Pan Lu, Kai-Wei Chang, Peng Gao, Hongsheng Li
PDF
MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models Nithin Gopalakrishnan Nair, Jeya Maria Jose Valanarasu, Vishal Patel
PDF
MaxMI: A Maximal Mutual Information Criterion for Manipulation Concept Discovery Pei Zhou, Yanchao Yang
PDF
MC-PanDA: Mask Confidence for Panoptic Domain Adaptation Ivan Martinović, Josip Šarić, Siniša Šegvić
PDF
McGrids: Monte Carlo-Driven Adaptive Grids for Iso-Surface Extraction Daxuan Ren, Hezi Shi, Jianmin Zheng, Jianfei Cai
PDF
MedRAT: Unpaired Medical Report Generation via Auxiliary Tasks Elad Hirsch, Gefen Dawidowicz, Ayellet Tal
PDF
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta, Jun Chen, Mohamed Elhoseiny, Ruohan Gao, Dinesh Manocha
PDF
MegaScenes: Scene-Level View Synthesis at Scale Joseph Tung, Gene Chou, Ruojin Cai, Guandao Yang, Kai Zhang, Gordon Wetzstein, Bharath Hariharan, Noah Snavely
PDF
MemBN: Robust Test-Time Adaptation via Batch Norm with Statistics Memory Juwon Kang, Nayeong Kim, Jungseul Ok, Suha Kwak
PDF
Memory-Efficient Fine-Tuning for Quantized Diffusion Model Hyogon Ryu, Seohyun Lim, Hyunjung Shim
PDF
Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli, Rita Cucchiara
PDF
Merlin: Empowering Multimodal LLMs with Foresight Minds En Yu, Liang Zhao, Yana Wei, Jinrong Yang, Dongming Wu, Lingyu Kong, Haoran Wei, Tiancai Wang, Zheng Ge, Xiangyu Zhang, Wenbing Tao
PDF
MERLiN: Single-Shot Material Estimation and Relighting for Photometric Stereo Ashish Tiwari, Satoshi Ikehata, Shanmuganathan Raman
PDF
Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation Yujin Chen, Yinyu Nie, Benjamin Ummenhofer, Reiner Birkl, Michael Paulitsch, Matthias Müller, Matthias Niessner
PDF
MeshAvatar: Learning High-Quality Triangular Human Avatars from Multi-View Videos Yushuo Chen, Zerong Zheng, Zhe Li, Chao Xu, Yebin Liu
PDF
MeshFeat: Multi-Resolution Features for Neural Fields on Meshes Mihir Mahajan, Florian Hofherr, Daniel Cremers
PDF
MeshSegmenter: Zero-Shot Mesh Segmentation via Texture Synthesis Ziming Zhong, Yanyu Xu, Jing Li, Jiale Xu, Zhengxin Li, Chaohui Yu, Shenghua Gao
PDF
MeshVPR: Citywide Visual Place Recognition Using 3D Meshes Gabriele Berton, Lorenz Junglas, Riccardo Zaccone, Thomas Pollok, Barbara Caputo, Carlo Masone
PDF
MesonGS: Post-Training Compression of 3D Gaussians via Efficient Attribute Transformation Shuzhao Xie, Weixiang Zhang, Chen Tang, Yunpeng Bai, Rongwei Lu, Shjia Ge, Zhi Wang
PDF
Meta-Optimized Angular Margin Contrastive Framework for Video-Language Representation Learning Thong Thanh Nguyen, Yi Bin, Xiaobao Wu, Xinshuai Dong, Zhiyuan Hu, Khoi M Le, Cong-Duy Nguyen, See Kiong Ng, Anh Tuan Luu
PDF
Meta-Prompting for Automating Zero-Shot Visual Recognition with LLMs Muhammad Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Sivan Doveh, Jakub Micorek, Mateusz Kozinski, Hilde Kuehne, Horst Possegger
PDF
MetaAT: Active Testing for Label-Efficient Evaluation of Dense Recognition Tasks Sanbao Su, Xin Li, Thang Doan, Sima Behpour, Wenbin He, Liang Gou, Fei Miao, Liu Ren
PDF
MetaAug: Meta-Data Augmentation for Post-Training Quantization Cuong Van Pham, Hoang Anh Dung, Cuong Cao Nguyen, Trung Le, Dinh Phung, Gustavo Carneiro, Thanh-Toan Do
PDF
MetaCap: Meta-Learning Priors from Multi-View Imagery for Sparse-View Human Performance Capture and Rendering Guoxing Sun, Rishabh Dabral, Pascal Fua, Christian Theobalt, Marc Habermann
PDF
MetaWeather: Few-Shot Weather-Degraded Image Restoration Youngrae Kim, Younggeol Cho, Thanh-Tung Nguyen, Seunghoon Hong, Dongman Lee
PDF
MEVG: Multi-Event Video Generation with Text-to-Video Models Gyeongrok Oh, Jaehwan Jeong, Sieun Kim, Wonmin Byeon, Jinkyu Kim, Sungwoong Kim, Sangpil Kim
PDF
Mew: Multiplexed Immunofluorescence Image Analysis Through an Efficient Multiplex Network Sukwon Yun, Jie Peng, Alexandro E Trevino, Chanyoung Park, Tianlong Chen
PDF
MICDrop: Masking Image and Depth Features via Complementary Dropout for Domain-Adaptive Semantic Segmentation Linyan Yang, Lukas Hoyer, Mark Weber, Tobias Fischer, Dengxin Dai, Laura Leal-Taixé, Daniel Cremers, Marc Pollefeys, Luc Van Gool
PDF
MIGS: Multi-Identity Gaussian Splatting via Tensor Decomposition Aggelina Chatziagapi, Grigorios Chrysos, Dimitris Samaras
PDF
milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing Fangqiang Ding, Zhen Luo, Peijun Zhao, Chris Xiaoxuan Lu
PDF
Mind the Interference: Retaining Pre-Trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models Longxiang Tang, Zhuotao Tian, Kai Li, Chunming He, Hantao Zhou, Hengshuang Zhao, Xiu Li, Jiaya Jia
PDF
MinD-3D: Reconstruct High-Quality 3D Objects in Human Brain Jianxiong Gao, Yuqian Fu, Yun Wang, Xuelin Qian, Jianfeng Feng, Yanwei Fu
PDF
Mini-Splatting: Representing Scenes with a Constrained Number of Gaussians Guangchi Fang, Bing Wang
PDF
Minimalist Vision with Freeform Pixels Jeremy Klotz, Shree Nayar
PDF
MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections Jiayue Liu, Xiao Tang, Freeman Cheng, Zihao Yang, Zhihao Li, Jianzhuang Liu, Yi Huang, Jiaqi Lin, Shiyong Liu, Xiaofei Wu, Songcen Xu, Chun Yuan
PDF
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment Brian Gordon, Yonatan Bitton, Yonatan Shafir, Roopal Garg, Xi Chen, Dani Lischinski, Daniel Cohen-Or, Idan Szpektor
PDF
Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models Taesup Kim, Donggeun Kim
PDF
Mitigating Background Shift in Class-Incremental Semantic Segmentation Gilhan Park, WonJun Moon, SuBeen Lee, Tae-Young Kim, Jae-Pil Heo
PDF
Mitigating Perspective Distortion-Induced Shape Ambiguity in Image Crops Aditya Prakash, Arjun Gupta, Saurabh Gupta
PDF
MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization Tianchen Zhao, Xuefei Ning, Tongcheng Fang, Enshu Liu, Guyue Huang, Zinan Lin, Shengen Yan, Guohao Dai, Yu Wang
PDF
Mixture of Efficient Diffusion Experts Through Automatic Interval and Sub-Network Selection Alireza Ganjdanesh, Yan Kang, Yuchen Liu, Richard Zhang, Zhe Lin, Heng Huang
PDF
ML-SemReg: Boosting Point Cloud Registration with Multi-Level Semantic Consistency Shaocheng Yan, Pengcheng Shi, Jiayuan Li
PDF
MLPHand: Real Time Multi-View 3D Hand Reconstruction via MLP Modeling Jian Yang, Jiakun Li, Guoming Li, Huaiyu Wu, Zhen Shen, Zhaoxin Fan
PDF
MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models Xin Liu, Yichen Zhu, Jindong Gu, Yunshi Lan, Chao Yang, Yu Qiao
PDF
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-Training Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Samuel Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Futang Peng, Anton Belyi, Max A Schwarzer, Hongyu Hè, Xianzhi Du, Haotian Zhang, Karanjeet Singh, Doug Kang, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, Mark Lee, Zirui Wang, Ruoming Pang, Peter Grasch, Alexander Toshev, Yinfei Yang
PDF
MMBENCH: Is Your Multi-Modal Model an All-Around Player? Yuan Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, Kai Chen, Dahua Lin
PDF
MMEarth: Exploring Multi-Modal Pretext Tasks for Geospatial Representation Learning Vishal Nedungadi, Ankit Kariryaa, Stefan Oehmcke, Serge Belongie, Christian Igel, Nico Lang
PDF
MMVR: Millimeter-Wave Multi-View Radar Dataset and Benchmark for Indoor Perception Mohammad Mahbubur Rahman, Ryoma Yataka, Sorachi Kato, Pu Wang, Peizhao Li, Adriano Cardace, Petros Boufounos
PDF
MO-EMT-NAS: Multi-Objective Continuous Transfer of Architectural Knowledge Between Tasks from Different Datasets Peng Liao, Xilu Wang, Yaochu Jin, Wenli Du
PDF
MoAI: Mixture of All Intelligence for Large Language and Vision Models Byung-Kwan Lee, Beomchan Park, Chae Won Kim, Yong Man Ro
PDF
MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices Yang Zhao, Zhisheng Xiao, Yanwu Xu, Haolin Jia, Tingbo Hou
PDF
MobileNetV4: Universal Models for the Mobile Ecosystem Danfeng Qin, Chas H Leichner, Manolis Delakis, Marco Fornoni, Shixin Luo, Fan Yang, Weijun Wang, Colby Banbury, Chengxi Ye, Berkin Akin, Vaibhav Aggarwal, Tenghui Zhu, Daniele Moro, Andrew Howard
PDF
Möbius Transform for Mitigating Perspective Distortions in Representation Learning Prakash Chandra Chhipa, Meenakshi Subhash Chippa, Kanjar De, Rajkumar Saini, Marcus Liwicki, Mubarak Shah
PDF
MOD-UV: Learning Mobile Object Detectors from Unlabeled Videos Yihong Sun, Bharath Hariharan
PDF
Modality Translation for Object Detection Adaptation Without Forgetting Prior Knowledge Heitor Rapela Medeiros, Masih Aminbeidokhti, Fidel A Guerrero Pena, David Latortue, Eric Granger, Marco Pedersoli
PDF
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks MohammadReza Davari, Eugene Belilovsky
PDF
Model Stock: All We Need Is Just a Few Fine-Tuned Models Dong-Hwan Jang, Sangdoo Yun, Dongyoon Han
PDF
Modeling and Driving Human Body Soundfields Through Acoustic Primitives Chao Huang, Dejan Markovic, Chenliang Xu, Alexander Richard
PDF
Modeling Label Correlations with Latent Context for Multi-Label Recognition Zhaomin Chen, Quan Cui, Ruoxi Deng, Jie Hu, Guodao Zhang
PDF
Modelling Competitive Behaviors in Autonomous Driving Under Generative World Model Guanren Qiao, Guiliang Liu, Guorui Quan, Rongxiao Qu
PDF
MoE-DiffIR: Task-Customized Diffusion Priors for Universal Compressed Image Restoration Yulin Ren, Xin Li, Bingchen Li, Xingrui Wang, Mengxi China Guo, Shijie Zhao, Li Zhang, Zhibo Chen
PDF
MoEAD: A Parameter-Efficient Model for Multi-Class Anomaly Detection Shiyuan Meng, Wenchao Meng, Qihang Zhou, Shizhong Li, Weiye Hou, Shibo He
PDF
MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model Muyao Niu, Xiaodong Cun, Xintao Wang, Yong Zhang, Ying Shan, Yinqiang Zheng
PDF
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation Kunpeng Song, Yizhe Zhu, Bingchen Liu, Qing Yan, Ahmed Elgammal, Xiao Yang
PDF
Momentum Auxiliary Network for Supervised Local Learning Junhao Su, Changpeng Cai, Feiyu Zhu, Chenghao He, Xiaojie Xu, Dongzhi Guan, Chenyang Si
PDF
Mono-ViFI: A Unified Learning Framework for Self-Supervised Single- and Multi-Frame Monocular Depth Estimation Jinfeng Liu, Lingtong Kong, Bo Li, Zerong Wang, Hong Gu, Jinwei Chen
PDF
Monocular Occupancy Prediction for Scalable Indoor Scenes Hongxiao Yu, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang
PDF
MonoTTA: Fully Test-Time Adaptation for Monocular 3D Object Detection Hongbin Lin, Yifan Zhang, Shuaicheng Niu, Shuguang Cui, Zhen Li
PDF
MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection Youngmin Oh, Hyung-Il Kim, Seong Tae Kim, Jung Uk Kim
PDF
MONTAGE: Monitoring Training for Attribution of Generative Diffusion Models Jonathan Brokman, Omer Hofman, Roman Vainshtein, Amit Giloni, Toshiya Shimizu, Inderjeet Singh, Oren Rachmil, Alon Zolfi, Asaf Shabtai, Yuki Unno, Hisashi Kojima
PDF
Motion and Structure from Event-Based Normal Flow Zhongyang Ren, Bangyan Liao, Delei Kong, Jinghang Li, Peidong Liu, Laurent Kneip, Guillermo Gallego, Yi Zhou
PDF
Motion Aware Event Representation-Driven Image Deblurring Zhijing Sun, Xueyang Fu, Longzhuo Huang, Aiping Liu, Zheng-Jun Zha
PDF
Motion Keyframe Interpolation for Any Human Skeleton Using Point Cloud-Based Human Motion Data Homogenisation Clinton A Mo, Kun Hu, Chengjiang Long, Dong Yuan, Zhiyong Wang
PDF
Motion Mamba: Efficient and Long Sequence Motion Generation Zeyu Zhang, Akide Liu, Ian Reid, Richard Hartley, Bohan Zhuang, Hao Tang
PDF
Motion-Guided Latent Diffusion for Temporally Consistent Real-World Video Super-Resolution Xi Yang, Chenhang He, Jianqi Ma, Lei Zhang
PDF
Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling Jaehyeok Kim, Dongyoon Wee, Dan Xu
PDF
Motion-Prior Contrast Maximization for Dense Continuous-Time Motion Estimation Friedhelm Hamann, Ziyun Wang, Ioannis Asmanis, Kenneth Chaney, Guillermo Gallego, Kostas Daniilidis
PDF
MotionChain: Conversational Motion Controllers via Multimodal Prompts Biao Jiang, Xin Chen, Chi Zhang, Fukun Yin, Zhuoyuan Li, Gang Yu, Jiayuan Fan
PDF
MotionDirector: Motion Customization of Text-to-Video Diffusion Models Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jia-Wei Liu, Weijia Wu, Jussi Keppo, Mike Zheng Shou
PDF
MotionLCM: Real-Time Controllable Motion Generation via Latent Consistency Model Wenxun Dai, Ling-Hao Chen, Jingbo Wang, Jinpeng Liu, Bo Dai, Yansong Tang
PDF
MoVideo: Motion-Aware Video Generation with Diffusion Models Jingyun Liang, Yuchen Fan, Kai Zhang, Radu Timofte, Luc Van Gool, Rakesh Ranjan
PDF
MRSP: Learn Multi-Representations of Single Primitive for Compositional Zero-Shot Learning Dongyao Jiang, Hui Chen, Haodong Jing, Yongqiang Ma, Nanning Zheng
PDF
MSD: A Benchmark Dataset for Floor Plan Generation of Building Complexes Casper van Engelenburg, Fatemeh Mostafavi, Emanuel Kuhn, Yuntae Jeon, Michael Franzen, Matthias Standfest, Jan van Gemert, Seyran Khademi
PDF
MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment Anurag Das, Xinting Hu, Li Jiang, Bernt Schiele
PDF
MTaDCS: Moving Trace and Feature Density-Based Confidence Sample Selection Under Label Noise Qingzheng Huang, Xilin He, Xiaole Xian, Qinliang Lin, Weicheng Xie, Siyang Song, Linlin Shen, Zitong Yu
PDF
MTKD: Multi-Teacher Knowledge Distillation for Image Super-Resolution Yuxuan Jiang, Chen Feng, Fan Zhang, David Bull
PDF
MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders Baijiong Lin, Weisen Jiang, Pengguang Chen, Yu Zhang, Shu Liu, Yingcong Chen
PDF
Multi-Branch Collaborative Learning Network for 3D Visual Grounding Zhipeng Qian, Yiwei Ma, Zhekai Lin, Jiayi Ji, Xiawu Zheng, Xiaoshuai Sun, Rongrong Ji
PDF
Multi-Granularity Sparse Relationship Matrix Prediction Network for End-to-End Scene Graph Generation Lei Wang, Zejian Yuan, Badong Chen
PDF
Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot Fabien Baradel, Thomas Lucas, Matthieu Armando, Salma Galaaoui, Romain Brégier, Philippe Weinzaepfel, Gregory Rogez
PDF
Multi-Label Cluster Discrimination for Visual Representation Learning Xiang An, Kaicheng Yang, Xiangzi Dai, Ziyong Feng, Jiankang Deng
PDF
Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification Jiangming Shi, Xiangbo Yin, Yeyun Chen, Yachao Zhang, Zhizhong Zhang, Yuan Xie, Yanyun Qu
PDF
Multi-Modal Crowd Counting via a Broker Modality Haoliang Meng, Xiaopeng Hong, Chenhao Wang, Miao Shang, Wangmeng Zuo
PDF
Multi-Modal Relation Distillation for Unified 3D Representation Learning Huiqun Wang, Yiping Bao, Panwang Pan, Zeming Li, Xiao Liu, Ruijie Yang, Di Huang
PDF
Multi-Modal Video Dialog State Tracking in the Wild Adnen Abdessaied, Lei Shi, Andreas Bulling
PDF
Multi-Person Pose Forecasting with Individual Interaction Perceptron and Prior Learning Peng Xiao, Yi Xie, Xuemiao Xu, Weihong Chen, Huaidong Zhang
PDF
Multi-RoI Human Mesh Recovery with Camera Consistency and Contrastive Losses Yongwei Nie, Changzhen Liu, Chengjiang Long, Qing Zhang, Guiqing Li, Hongmin Cai
PDF
Multi-Scale Cross Distillation for Object Detection in Aerial Images Kun Wang, Zi Wang, Zhang Li, Xichao Teng, Yang Li
PDF
Multi-Sentence Grounding for Long-Term Instructional Video Zeqian Li, Qirui Chen, Tengda Han, Ya Zhang, Yan-Feng Wang, Weidi Xie
PDF
Multi-Task Domain Adaptation for Language Grounding with 3D Objects Penglei Sun, Yaoxian Song, Xinglin Pan, Peijie Dong, Xiaofei Yang, Qiang Wang, Zhixu Li, Tiefeng Li, Xiaowen Chu
PDF
MultiDelete for Multimodal Machine Unlearning Jiali Cheng, Hadi Amiri
PDF
MultiGen: Zero-Shot Image Generation from Multi-Modal Prompts Zhi-Fan Wu, Lianghua Huang, Wei Wang, Yanheng Wei, Yu Liu
PDF
Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition Masashi Hatano, Ryo Hachiuma, Ryo Fujii, Hideo Saito
PDF
Multimodal Label Relevance Ranking via Reinforcement Learning Taian Guo, Taolin Zhang, Haoqian Wu, Hanjun Li, Ruizhi Qiao, Xing Sun
PDF
Multiscale Graph Texture Network Ravishankar Evani, Deepu Rajan, Shangbo Mao
PDF
Multiscale Sliced Wasserstein Distances as Perceptual Color Difference Measures Jiaqi He, Zhihua Wang, Leon Wang, Tsein-I Liu, Yuming Fang, Qilin Sun, Kede Ma
PDF
Multistain Pretraining for Slide Representation Learning in Pathology Guillaume Jaume, Anurag J Vaidya, Andrew Zhang, Andrew Song, Richard J Chen, Sharifa Sahai, Dandan Mo, Emilio Madrigal, Long P Le, Faisal Mahmood
PDF
MUSES: The Multi-Sensor Semantic Perception Dataset for Driving Under Uncertainty Tim Broedermann, David Brüggemann, Christos Sakaridis, Kevin Ta, Odysseas Liagouris, Jason Corkill, Luc Van Gool
PDF
MutDet: Mutually Optimizing Pre-Training for Remote Sensing Object Detection Ziyue Huang, Yongchao Feng, Qingjie Liu, Yunhong Wang
PDF
Mutual Learning for Acoustic Matching and Dereverberation via Visual Scene-Driven Diffusion Jian Ma, Wenguan Wang, Yi Yang, Feng Zheng
PDF
MVDD: Multi-View Depth Diffusion Models Zhen Wang, Qiangeng Xu, Feitong Tan, Menglei Chai, Shichen Liu, Rohit Pandey, Sean Fanello, Achuta Kadambi, Yinda Zhang
PDF
MVDiffHD: A Dense High-Resolution Multi-View Diffusion Model for Single or Sparse-View 3D Object Reconstruction Shitao Tang, Jiacheng Chen, Dilin Wang, Chengzhou Tang, Fuyang Zhang, Yuchen Fan, Vikas Chandra, Yasutaka Furukawa, Rakesh Ranjan
PDF
MVPGS: Excavating Multi-View Priors for Gaussian Splatting from Sparse Input Views Wangze Xu, Huachen Gao, Shihe Shen, Rui Peng, Jianbo Jiao, Ronggang Wang
PDF
MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo Tianqi Liu, Guangcong Wang, Shoukang Hu, Liao Shen, Xinyi Ye, Yuhang Zang, Zhiguo Cao, Wei Li, Ziwei Liu
PDF
MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, Jianfei Cai
PDF
MyVLM: Personalizing VLMs for User-Specific Queries Yuval Alaluf, Elad Richardson, Sergey Tulyakov, Kfir Aberman, Danny Cohen-Or
PDF
N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields Yash Bhalgat, Iro Laina, Joao F Henriques, Andrew Zisserman, Andrea Vedaldi
PDF
NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition Chenyu Liu, Jia Pan, Jinshui Hu, Baocai Yin, Bing Yin, Mingjun Chen, Cong Liu, Jun Du, Qingfeng Liu
PDF
NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models Gengze Zhou, Yicong Hong, Zun Wang, Xin Eric Wang, Qi Wu
PDF
Navigating Text-to-Image Generative Bias Across Indic Languages Surbhi Mittal, Arnav Sudan, Mayank Vatsa, Richa Singh, Tamar Glaser, Tal Hassner
PDF
Navigation Instruction Generation with BEV Perception and Large Language Models Sheng Fan, Rui Liu, Wenguan Wang, Yi Yang
PDF
NePhi: Neural Deformation Fields for Approximately Diffeomorphic Medical Image Registration Lin Tian, Thomas H Greer, Raul San Jose Estepar, Roni Sengupta, Marc Niethammer
PDF
NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini, Adrien Gaidon, Zsolt Kira, Rares Ambrus
PDF
NeRF-XL: NeRF at Any Scale with Multi-GPU Ruilong Li, Sanja Fidler, Angjoo Kanazawa, Francis Williams
PDF
NeRMo: Learning Implicit Neural Representations for 3D Human Motion Prediction Dong Wei, Huaijiang Sun, Xiaoning Sun, Shengxiang Hu
PDF
Neural Graphics Texture Compression Supporting Random Access Farzad Farhadzadeh, Qiqi Hou, Hoang Le, Amir Said, Randall R Rauwendaal, Alex Bourd, Fatih Porikli
PDF
Neural Metamorphosis Xingyi Yang, Xinchao Wang
PDF
Neural Poisson Solver: A Universal and Continuous Framework for Natural Signal Blending Delong Wu, Hao Zhu, Qi Zhang, You Li, Xun Cao, Zhan Ma
PDF
Neural Spectral Decomposition for Dataset Distillation Shaolei Yang, Shen Cheng, Mingbo Hong, Haoqiang Fan, Xing Wei, Shuaicheng Liu
PDF
Neural Surface Detection for Unsigned Distance Fields Federico Stella, Nicolas Talabot, Hieu Le, Pascal Fua
PDF
Neural Volumetric World Models for Autonomous Driving Zanming Huang, Jimuyang Zhang, Eshed Ohn-Bar
PDF
NeuroNCAP: Photorealistic Closed-Loop Safety Testing for Autonomous Driving William Ljungbergh, Adam Tonderski, Joakim Johnander, Holger Caesar, Kalle Åström, Michael Felsberg, Christoffer Petersson
PDF
NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-Individual Pretraining and Multi-Level Modulation Jingyang Huo, Yikai Wang, Yanwei Fu, Xuelin Qian, Chong Li, Yun Wang, Jianfeng Feng
PDF
NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation Ruikai Cui, Weizhe Liu, Weixuan Sun, Senbo Wang, Taizhang Shang, Yang Li, Xibin Song, Han Yan, Zhennan Wu, Shenzhou Chen, Hongdong Li, Pan Ji
PDF
NGP-RT: Fusing Multi-Level Hash Features with Lightweight Attention for Real-Time Novel View Synthesis Yubin Hu, Xiaoyang Guo, Yang Xiao, Jingwei Huang, Yong-Jin Liu
PDF
Nickel and Diming Your GAN: A Dual-Method Approach to Enhancing GAN Efficiency via Knowledge Distillation Sangyeop Yeo, Yoojin Jang, Jaejun Yoo
PDF
NICP: Neural ICP for 3D Human Registration at Scale Riccardo Marin, Enric Corona, Gerard Pons-Moll
PDF
NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model Zhongqun Zhang, Hengfei Wang, Ziwei Yu, Yihua Cheng, Angela Yao, Hyung Jin Chang
PDF
Noise Calibration: Plug-and-Play Content-Preserving Video Enhancement Using Pre-Trained Video Diffusion Models Qinyu Yang, Haoxin Chen, Yong Zhang, Menghan Xia, Xiaodong Cun, Zhixun Su, Ying Shan
PDF
Noise-Assisted Prompt Learning for Image Forgery Detection and Localization Dong Li, Jiaying Zhu, Xueyang Fu, Xun Guo, Yidi Liu, Gang Yang, Jiawei Liu, Zheng-Jun Zha
PDF
Non-Exemplar Domain Incremental Learning via Cross-Domain Concept Integration Qiang Wang, Yuhang He, Songlin Dong, Xinyuan Gao, Shaokun Wang, Yihong Gong
PDF
Non-Line-of-Sight Estimation of Fast Human Motion with Slow Scanning Imagers Javier Grau Chopite, Patrick Hähn, Matthias B Hullin
PDF
Non-Parametric Sensor Noise Modeling and Synthesis Ali Mosleh, Luxi Zhao, Atin Vikram Singh, Jaeduk Han, Abhijith Punnappurath, Marcus A Brubaker, Jihwan Choe, Michael S Brown
PDF
Non-Transferable Pruning Ruyi Ding, Lili Su, A. Adam Ding, Yunsi Fei
PDF
Nonverbal Interaction Detection Jianan Wei, Tianfei Zhou, Yi Yang, Wenguan Wang
PDF
Norface: Improving Facial Expression Analysis by Identity Normalization Hanwei Liu, Rudong An, Zhimeng Zhang, Bowen Ma, Wei Zhang, Yan Song, Yujing Hu, Chen Wei, Yu Ding
PDF
Norma: A Noise Robust Memory-Augmented Framework for Whole Slide Image Classification Yu Bai, Bo Zhang, Zheng Zhang, Shuo Yan, Zibo Ma, Wu Liu, Xiuzhuang Zhou, Xiangyang Gong, Wendong Wang
PDF
Not Just Change the Labels, Learn the Features: Watermarking Deep Neural Networks with Multi-View Data Yuxuan Li, Sarthak Kumar Maharana, Yunhui Guo
PDF
NOVUM: Neural Object Volumes for Robust Object Classification Artur Jesslen, Guofeng Zhang, Angtian Wang, Wufei Ma, Alan Yuille, Adam Kortylewski
PDF
nuCraft: Crafting High Resolution 3D Semantic Occupancy for Unified 3D Scene Understanding Benjin Zhu, Zhe Wang, Hongsheng Li
PDF
Nuvo: Neural UV Mapping for Unruly 3D Representations Pratul Srinivasan, Stephan J Garbin, Dor Verbin, Jonathan T Barron, Ben Mildenhall
PDF
NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim, Minsu Cho, Doyup Lee
PDF
Nymeria: A Massive Collection of Egocentric Multi-Modal Human Motion in the Wild Lingni Ma, Yuting Ye, Rowan Postyeni, Alexander J Gamino, Vijay Baiyya, Luis Pesqueira, Kevin M Bailey, David Soriano Fosas, Fangzhou Hong, Vladimir Guzov, Yifeng Jiang, Hyo Jin Kim, Jakob Engel, Karen Liu, Ziwei Liu, Renzo De Nardi, Richard Newcombe
PDF
O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation Muer Tie, Julong Wei, Zhengjun Wang, Ke Wu, Shanshuai Yuan, Kaizhao Zhang, Jie Jia, Jieru Zhao, Zhongxue Gan, Wenchao Ding
PDF
OAPT: Offset-Aware Partition Transformer for Double JPEG Artifacts Removal Qiao Mo, Yukang Ding, Jinhua Hao, Qiang Zhu, Ming Sun, Chao Zhou, Feiyu Chen, Shuyuan Zhu
PDF
OAT: Object-Level Attention Transformer for Gaze Scanpath Prediction Yini Fang, Jingling Yu, Haozheng Zhang, Ralf van der Lans, Bertram E Shi
PDF
Object-Aware NIR-to-Visible Translation Yunyi Gao, Lin Gu, Qiankun Liu, Ying Fu
PDF
Object-Aware Query Perturbation for Cross-Modal Image-Text Retrieval Naoya Sogi, Takashi Shibata, Makoto Terao
PDF
Object-Centric Diffusion for Efficient Video Editing Kumara Kahatapitiya, Adil Karjauv, Davide Abati, Fatih Porikli, Yuki M Asano, Amirhossein Habibian
PDF
Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models Yasi Zhang, Peiyu Yu, Ying Nian Wu
PDF
Object-Oriented Anchoring and Modal Alignment in Multimodal Learning Shibin Mei, Bingbing Ni, Hang Wang, Chenglong Zhao, Fengfa Hu, Zhiming Pi, BiLian Ke
PDF
ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion Daniel Winter, Matan Cohen, Shlomi Fruchter, Yael Pritch, Alex Rav-Acha, Yedid Hoshen
PDF
OccGen: Generative Multi-Modal 3D Occupancy Prediction for Autonomous Driving Guoqing Wang, Zhongdao Wang, Pin Tang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma
PDF
Occluded Gait Recognition with Mixture of Experts: An Action Detection Perspective Panjian Huang, Yunjie Peng, Saihui Hou, Chunshui Cao, Xu Liu, Zhiqiang He, Yongzhen Huang
PDF
Occlusion Handling in 3D Human Pose Estimation with Perturbed Positional Encoding Niloofar Azizi, Mohsen Fayyaz, Horst Bischof
PDF
Occlusion-Aware Seamless Segmentation Yihong Cao, Jiaming Zhang, Hao Shi, Kunyu Peng, Yuhongxuan Zhang, Hui Zhang, Rainer Stiefelhagen, Kailun Yang
PDF
Occupancy as Set of Points Yiang Shi, Tianheng Cheng, Qian Zhang, Wenyu Liu, Xinggang Wang
PDF
OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving Wenzhao Zheng, Weiliang Chen, Yuanhui Huang, Borui Zhang, Yueqi Duan, Jiwen Lu
PDF
Octopus: Embodied Vision-Language Programmer from Environmental Feedback Jingkang Yang, Yuhao Dong, Shuai Liu, Bo Li, Ziyue Wang, ChenCheng Jiang, Haoran Tan, Jiamu Kang, Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu
PDF
OGNI-DC: Robust Depth Completion with Optimization-Guided Neural Iterations Yiming Zuo, Jia Deng
PDF
OLAF: A Plug-and-Play Framework for Enhanced Multi-Object Multi-Part Scene Parsing Pranav Gupta, Rishubh Singh, Pradeep Shenoy, Ravi Kiran Sarvadevabhatla
PDF
OMG: Occlusion-Friendly Personalized Multi-Concept Generation in Diffusion Models Zhe Kong, Yong Zhang, Tianyu Yang, Tao Wang, Kaihao Zhang, Bizhu Wu, Guanying Chen, Wei Liu, Wenhan Luo
PDF
Omni-Recon: Harnessing Image-Based Rendering for General-Purpose Neural Radiance Fields Yonggan Fu, Huaizhi Qu, Zhifan Ye, Chaojian Li, Kevin Zhao, Yingyan Lin
PDF
Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation Mengchen Zhang, Tong Wu, Tai Wang, Tengfei Wang, Ziwei Liu, Dahua Lin
PDF
Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking Jiyao Zhang, Weiyao Huang, Bo Peng, Mingdong Wu, Fei Hu, Zijian Chen, Bo Zhao, Hao Dong
PDF
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web Raghav Kapoor, Yash Parag Butala, Melisa A Russak, Jing Yu Koh, Kiran Kamble, Waseem AlShikh, Ruslan Salakhutdinov
PDF
OmniNOCS: A Unified NOCS Dataset and Model for 3D Lifting of 2D Objects Akshay Krishnan, Abhijit Kundu, Kevis-Kokitsi Maninis, James Hays, Matthew Brown
PDF
OmniSat: Self-Supervised Modality Fusion for Earth Observation Guillaume Astruc, Nicolas Gonthier, Clement Mallet, Loic Landrieu
PDF
OmniSSR: Zero-Shot Omnidirectional Image Super-Resolution Using Stable Diffusion Model Runyi Li, Xuhan Sheng, Weiqi Li, Jian Zhang
PDF
Omniview-Tuning: Boosting Viewpoint Invariance of Vision-Language Pre-Training Models Shouwei Ruan, Yinpeng Dong, Liu Hanqing, Yao Huang, Hang Su, Xingxing Wei
PDF
OMR: Occlusion-Aware Memory-Based Refinement for Video Lane Detection Dongkwon Jin, Chang-Su Kim
PDF
On Calibration of Object Detectors: Pitfalls, Evaluation and Baselines Selim Kuzucu, Kemal Oksuz, Jonathan Sadeghi, Puneet Dokania
PDF
On Learning Discriminative Features from Synthesized Data for Self-Supervised Fine-Grained Visual Recognition Zihu Wang, Lingqiao Liu, Scott Ricardo Figueroa Weston, Samuel Tian, Peng Li
PDF
On Pretraining Data Diversity for Self-Supervised Learning Hasan Abed Al Kader Hammoud, Tuhin Das, Fabio Pizzati, Philip Torr, Adel Bibi, Bernard Ghanem
PDF
On Spectral Properties of Gradient-Based Explanation Methods Amir Mehrpanah, Erik Englesson, Hossein Azizpour
PDF
On the Approximation Risk of Few-Shot Class-Incremental Learning Xuan Wang, Zhong Ji, Xiyao Liu, Yanwei Pang, Jungong Han
PDF
On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy Letian Huang, Jiayang Bai, Jie Guo, Yuanqi Li, Yanwen Guo
PDF
On the Evaluation Consistency of Attribution-Based Explanations Jiarui Duan, Haoling Li, Haofei Zhang, Hao Jiang, Mengqi Xue, Li Sun, Mingli Song, Jie Song
PDF
On the Topology Awareness and Generalization Performance of Graph Neural Networks Junwei Su, Chuan Wu
PDF
On the Utility of 3D Hand Poses for Action Recognition Md Salman Shamil, Dibyadip Chatterjee, Fadime Sener, Shugao Ma, Angela Yao
PDF
On the Viability of Monocular Depth Pre-Training for Semantic Segmentation Dong Lao, Fengyu Yang, Daniel Wang, Hyoungseob Park, Samuel Lu, Alex Wong, Stefano Soatto
PDF
On the Vulnerability of Skip Connections to Model Inversion Attacks Jun Hao Koh, Sy-Tuyen Ho, Ngoc-Bao Nguyen, Ngai-Man Cheung
PDF
On-the-Fly Category Discovery for LiDAR Semantic Segmentation Hyeonseong Kim, Sung-Hoon Yoon, Minseok Kim, Kuk-Jin Yoon
PDF
One-Shot Diffusion Mimicker for Handwritten Text Generation Gang Dai, Yifan Zhang, Quhui Ke, Qiangya Guo, Shuangping Huang
PDF
One-Stage Prompt-Based Continual Learning Youngeun Kim, Yuhang Li, Priyadarshini Panda
PDF
OneRestore: A Universal Restoration Framework for Composite Degradation Yu Guo, Yuan Gao, Yuxu Lu, Huilin Zhu, Wen Liu, Shengfeng He
PDF
OneTrack: Demystifying the Conflict Between Detection and Tracking in End-to-End 3D Trackers Qitai Wang, Jiawei He, Yuntao Chen, Zhaoxiang Zhang
PDF
OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework Wanyun Li, Pinxue Guo, Xinyu Zhou, Lingyi Hong, Yangji He, Xiangyu Zheng, Wei Zhang, Wenqiang Zhang
PDF
Online Continuous Generalized Category Discovery Keon-Hee Park, Hakyung Lee, Kyungwoo Song, Gyeong-Moon Park
PDF
Online Temporal Action Localization with Memory-Augmented Transformer Youngkil Song, Dongkeun Kim, Minsu Cho, Suha Kwak
PDF
Online Vectorized HD Map Construction Using Geometry Zhixin Zhang, Yiyuan Zhang, Xiaohan Ding, Fusheng Jin, Xiangyu Yue
PDF
Online Video Quality Enhancement with Spatial-Temporal Look-up Tables Zefan Qu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Cairong Zhao
PDF
Online Zero-Shot Classification with CLIP Qi Qian, Juhua Hu
PDF
OP-Align: Object-Level and Part-Level Alignment for Self-Supervised Category-Level Articulated Object Pose Estimation Yuchen Che, Ryo Furukawa, Asako Kanezaki
PDF
Open Panoramic Segmentation Junwei Zheng, Ruiping Liu, Yufan Chen, Kunyu Peng, Chengzhi Wu, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen
PDF
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation Pengfei Wang, Yuxi Wang, Shuai Li, Zhaoxiang Zhang, Zhen Lei, Lei Zhang
PDF
Open Vocabulary Multi-Label Video Classification Rohit Gupta, Mamshad Nayeem Rizve, Jayakrishnan Unnikrishnan, Ashish Tawari, Son Tran, Mubarak Shah, Benjamin Yao, Trishul A Chilimbi
PDF
Open-Set Biometrics: Beyond Good Closed-Set Models Yiyang Su, Minchul Kim, Feng Liu, Anil Jain, Xiaoming Liu
PDF
Open-Set Domain Adaptation via Joint Error Based Multi-Class Positive and Unlabeled Learning Dexuan Zhang, Thomas Westfechtel, Tatsuya Harada
PDF
Open-Set Recognition in the Age of Vision-Language Models Dimity Miller, Niko Suenderhauf, Alex Kenna, Keita Mason
PDF
Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models Xiaoyu Zhu, Hao Zhou, Pengfei Xing, Long Zhao, Hao Xu, Junwei Liang, Alexander G. Hauptmann, Ting Liu, Andrew Gallagher
PDF
Open-Vocabulary Camouflaged Object Segmentation Youwei Pang, Xiaoqi Zhao, JiaMing Zuo, Lihe Zhang, Huchuan Lu
PDF
Open-Vocabulary RGB-Thermal Semantic Segmentation GuoQiang Zhao, JunJie Huang, Xiaoyun Yan, Zhaojing Wang, Junwei Tang, Yangjun Ou, Xinrong Hu, Tao Peng
PDF
Open-Vocabulary SAM: Segment and Recognize Twenty-Thousand Classes Interactively Haobo Yuan, Xiangtai Li, Chong Zhou, Yining Li, Kai Chen, Chen Change Loy
PDF
Open-World Dynamic Prompt and Continual Visual Representation Learning Youngeun Kim, Jun Fang, Qin Zhang, Zhaowei Cai, Yantao Shen, Rahul Duggal, Dripta S. Raychaudhuri, Zhuowen Tu, Yifan Xing, Onkar Dabeer
PDF
OPEN: Object-Wise Position Embedding for Multi-View 3D Object Detection Jinghua Hou, Tong Wang, Xiaoqing Ye, Zhe Liu, Shi Gong, Xiao Tan, Errui Ding, Jingdong Wang, Xiang Bai
PDF
OpenIns3D: Snap and Lookup for 3D Open-Vocabulary Instance Segmentation Zhening Huang, Xiaoyang Wu, Xi Chen, Hengshuang Zhao, Lei Zhu, Joan Lasenby
PDF
OpenKD: Opening Prompt Diversity for Zero- and Few-Shot Keypoint Detection Changsheng Lu, Zheyuan Liu, Piotr Koniusz
PDF
OpenPSG: Open-Set Panoptic Scene Graph Generation via Large Multimodal Models Zijian Zhou, Zheng Zhu, Holger Caesar, Miaojing Shi
PDF
OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection Hu Zhang, Xu Jianhua, Tao Tang, Haiyang Sun, Xin Yu, Zi Helen Huang, Kaicheng Yu
PDF
Operational Open-Set Recognition and PostMax Refinement Steve Cruz, Ryan Rabinowitz, Manuel Günther, Terrance E. Boult
PDF
OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding Ming Hu, Peng Xia, Lin Wang, Siyuan Yan, Feilong Tang, Zhongxing Xu, Yimin Luo, Kaimin Song, Jurgen Leitner, Xuelian Cheng, Jun Cheng, Chi Liu, Kaijing Zhou, Zongyuan Ge
PDF
Optimal Transport of Diverse Unsupervised Tasks for Robust Learning from Noisy Few-Shot Data Xiaofan Que, Qi Yu
PDF
Optimization-Based Uncertainty Attribution via Learning Informative Perturbations Hanjing Wang, Bashirul Azam Biswas, Qiang Ji
PDF
Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation Yixiao Wang, Chen Tang, Lingfeng Sun, Simone Rossi, Yichen Xie, Chensheng Peng, Thomas Hannagan, Stefano Sabatini, Nicola Poerio, Masayoshi Tomizuka, Wei Zhan
PDF
Optimizing Factorized Encoder Models: Time and Memory Reduction for Scalable and Efficient Action Recognition Shreyank N Gowda, Anurag Arnab, Jonathan Huang
PDF
Optimizing Illuminant Estimation in Dual-Exposure HDR Imaging Mahmoud Afifi, Zhenhua Hu, Liang Liang
PDF
Osmosis: RGBD Diffusion Prior for Underwater Image Restoration Opher Bar Nathan, Deborah Levy, Tali Treibitz, Dan Rosenbaum
PDF
OTSeg: Multi-Prompt Sinkhorn Attention for Zero-Shot Semantic Segmentation Kwanyoung Kim, Yujin Oh, Jong Chul Ye
PDF
Oulu Remote-Photoplethysmography Physical Domain Attacks Database (ORPDAD) Marko Savic, Guoying Zhao
PDF
Out-of-Bounding-Box Triggers: A Stealthy Approach to Cheat Object Detectors Tao Lin, Lijia Yu, Gaojie Jin, Renjue Li, Peng Wu, Lijun Zhang
PDF
OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation Zhenyu Wang, Ya-Li Li, Taichi Liu, Hengshuang Zhao, Shengjin Wang
PDF
Overcome Modal Bias in Multi-Modal Federated Learning via Balanced Modality Selection Yunfeng Fan, Wenchao Xu, Haozhao Wang, Fushuo Huo, Jinyu Chen, Song Guo
PDF
Overcoming Distribution Mismatch in Quantizing Image Super-Resolution Networks Cheeun Hong, Kyoung Mu Lee
PDF
OvSW: Overcoming Silent Weights for Accurate Binary Neural Networks Jingyang Xiang, Zuohui Chen, Siqi Li, Qing Wu, Yong Liu
PDF
PACE: Pose Annotations in Cluttered Environments Yang You, Kai Xiong, Zhening Yang, Zhengxiang Huang, Junwei Zhou, Ruoxi Shi, Zhou Fang, Adam Harley, Leonidas Guibas, Cewu Lu
PDF
PairingNet: A Learning-Based Pair-Searching and -Matching Network for Image Fragments Rixin Zhou, Ding Xia, Yi Zhang, Honglin Pang, Xi Yang, Chuntao Li
PDF
Pairwise Distance Distillation for Unsupervised Real-World Image Super-Resolution Yuehan Zhang, Seungjun Lee, Angela Yao
PDF
PALM: Predicting Actions Through Language Models Sanghwan Kim, Daoji Huang, Yongqin Xian, Otmar Hilliges, Luc Van Gool, Xi Wang
PDF
Panel-Specific Degradation Representation for Raw Under-Display Camera Image Restoration Youngjin Oh, Keuntek Lee, Jooyoung Lee, Dae-Hyun Lee, Nam Ik Cho
PDF
PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion Guansong Lu, Yuanfan Guo, Jianhua Han, Minzhe Niu, Yihan Zeng, Songcen Xu, Zeyi Huang, Zhao Zhong, Wei Zhang, Hang Xu
PDF
PanoFree: Tuning-Free Holistic Multi-View Image Generation with Cross-View Self-Guidance Aoming Liu, Zhong Li, Zhang Chen, Nannan Li, Yi Xu, Bryan Plummer
PDF
PanoVOS: Bridging Non-Panoramic and Panoramic Views with Transformer for Video Segmentation Shilin Yan, Xiaohao Xu, Renrui Zhang, Lingyi Hong, Wenchao Chen, Wenqiang Zhang, Wei Zhang
PDF
PapMOT: Exploring Adversarial Patch Attack Against Multiple Object Tracking Jiahuan Long, Tingsong Jiang, Wen Yao, Shuai Jia, Weijia Zhang, Weien Zhou, Chao Ma, Xiaoqian Chen
PDF
PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference Tanvir Mahmud, Burhaneddin Yaman, Chun-Hao Liu, Diana Marculescu
PDF
Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach Taolin Zhang, Jiawang Bai, Zhihe Lu, Dongze Lian, Genping Wang, Xinchao Wang, Shu-Tao Xia
PDF
Parameterization-Driven Neural Surface Reconstruction for Object-Oriented Editing in Neural Rendering Baixin Xu, Jiangbei Hu, Fei Hou, Kwan-Yee Lin, Wayne Wu, Chen Qian, Ying He
PDF
Parameterized Quasi-Physical Simulators for Dexterous Manipulations Transfer Xueyi Liu, Kangbo Lyu, Jieqiong Zhang, Tao Du, Li Yi
PDF
ParCo: Part-Coordinating Text-to-Motion Synthesis Qiran Zou, Shangyuan Yuan, Shian Du, Yu Wang, Chang Liu, Yi Xu, Jie Chen, Xiangyang Ji
PDF
PARE-Net: Position-Aware Rotation-Equivariant Networks for Robust Point Cloud Registration Runzhao Yao, Shaoyi Du, Wenting Cui, Canhui Tang, Chengwu Yang
PDF
PARIS3D: Reasoning-Based 3D Part Segmentation Using Large Multimodal Model Amrin Kareem, Jean Lahoud, Hisham Cholakkal
PDF
Parrot Captions Teach CLIP to Spot Text Yiqi Lin, Conghui He, Alex Jinpeng Wang, Bin Wang, Weijia Li, Mike Zheng Shou
PDF
Parrot: Pareto-Optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation Seung Hyun Lee, Yinxiao Li, Junjie Ke, Innfarn Yoo, Han Zhang, Jiahui Yu, Qifei Wang, Fei Deng, Glenn Entis, Junfeng He, Gang Li, Sangpil Kim, Irfan Essa, Feng Yang
PDF
Part2Object: Hierarchical Unsupervised 3D Instance Segmentation Cheng Shi, Yulin Zhang, Bin Yang, Jiajin Tang, Yuexin Ma, Sibei Yang
PDF
PartCraft: Crafting Creative Objects by Parts Kam Woh Ng, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
PDF
PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects Junyi Li, Junfeng Wu, Weizhi Zhao, Song Bai, Xiang Bai
PDF
PartImageNet++ Dataset: Scaling up Part-Based Models for Robust Recognition Xiao Li, Yining Liu, Na Dong, Sitian Qin, Xiaolin Hu
PDF
PartSTAD: 2D-to-3D Part Segmentation Task Adaptation Hyunjin Kim, Minhyuk Sung
PDF
PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation Zhenyu Li, Shariq Farooq Bhat, Peter Wonka
PDF
Pathformer3D: A 3D Scanpath Transformer for 360° Images Rong Quan, Yantao Lai, Mengyu Qiu, Dong Liang
PDF
PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology Yuxuan Sun, Hao Wu, Chenglu Zhu, Sunyi Zheng, Qizi Chen, Kai Zhang, Yunlong Zhang, Dan Wan, Xiaoxiao Lan, Mengyue Zheng, Jingxiong Li, Xinheng Lyu, Tao Lin, Lin Yang
PDF
Pathology-Knowledge Enhanced Multi-Instance Prompt Learning for Few-Shot Whole Slide Image Classification Linhao Qu, Dingkang Yang, Dan Huang, Qinhao Guo, Rongkui Luo, Shaoting Zhang, Xiaosong Wang
PDF
PAV: Personalized Head Avatar from Unstructured Video Collection Akin Caliskan, Berkay Kicanaoglu, Hyeongwoo Kim
PDF
Paying More Attention to Images: A Training-Free Method for Alleviating Hallucination in LVLMs Shi Liu, Kecheng Zheng, Wei Chen
PDF
PCF-Lift: Panoptic Lifting by Probabilistic Contrastive Fusion Runsong Zhu, Shi Qiu, Qianyi Wu, Ka-Hei Hui, Pheng-Ann Heng, Chi-Wing Fu
PDF
PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers Ananthu Aniraj, Cassio F. Dantas, Dino Ienco, Diego Marcos
PDF
PDT: UAV Target Detection Dataset for Pests and Diseases Tree Mingle Zhou, Rui Xing, Delong Han, Zhiyong Qi, Gang Li
PDF
PEA-Diffusion: Parameter-Efficient Adapter with Knowledge Distillation in Non-English Text-to-Image Generation Jian Ma, Chen Chen, Qingsong Xie, Haonan Lu
PDF
Per-Gaussian Embedding-Based Deformation for Deformable 3D Gaussian Splatting Jeongmin Bae, Seoha Kim, Youngsik Yun, Hahyun Lee, Gun Bang, Youngjung Uh
PDF
Perceptual Evaluation of Audio-Visual Synchrony Grounded in Viewers’ Opinion Scores Lucas Goncalves, Prashant Mathur, Chandrashekhar Lavania, Metehan Cekic, Marcello Federico, Kyu Han
PDF
Personalized Federated Domain-Incremental Learning Based on Adaptive Knowledge Matching Yichen Li, Wenchao Xu, Haozhao Wang, Yining Qi, Jingcai Guo, Ruixuan Li
PDF
Personalized Privacy Protection Mask Against Unauthorized Facial Recognition Ka-Ho Chow, Sihao Hu, Tiansheng Huang, Ling Liu
PDF
Personalized Video Relighting with an At-Home Light Stage Jun Myeong Choi, Max Christman, Roni Sengupta
PDF
PetFace: A Large-Scale Dataset and Benchmark for Animal Identification Risa Shinoda, Kaede Shiohara
PDF
PFedEdit: Personalized Federated Learning via Automated Model Editing Haolin Yuan, William Paul, John Aucott, Philippe Burlina, Yinzhi Cao
PDF
PFGS: High Fidelity Point Cloud Rendering via Feature Splatting Jiaxu Wang, Zhang Ziyi, Junhao He, Renjing Xu
PDF
Phase Concentration and Shortcut Suppression for Weakly Supervised Semantic Segmentation Hoyong Kwon, Jaeseok Jeong, Sung-Hoon Yoon, Kuk-Jin Yoon
PDF
Photon Inhibition for Energy-Efficient Single-Photon Imaging Lucas J Koerner, Shantanu Gupta, Atul N Ingle, Mohit Gupta
PDF
Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering Ruofan Liang, Zan Gojcic, Merlin Nimier-David, David Acuna, Nandita Vijaykumar, Sanja Fidler, Zian Wang
PDF
Photorealistic Video Generation with Diffusion Models Agrim Gupta, Lijun Yu, Kihyuk Sohn, Xiuye Gu, Meera Hahn, Li Fei-Fei, Irfan Essa, Lu Jiang, Jose Lezama
PDF
PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations Yang Zheng, Qingqing Zhao, Guandao Yang, Wang Yifan, Donglai Xiang, Florian Dubost, Dmitry Lagun, Thabo Beeler, Federico Tombari, Leonidas Guibas, Gordon Wetzstein
PDF
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation Shaowei Liu, Zhongzheng Ren, Saurabh Gupta, Shenlong Wang
PDF
Physical-Based Event Camera Simulator Haiqian Han, Jiacheng Lyu, Jianing Li, Henglu Wei, Cheng Li, Yajing Wei, Shu Chen, Xiangyang Ji
PDF
Physically Plausible Color Correction for Neural Radiance Fields Qi Zhang, Ying Feng, Hongdong Li
PDF
Physics-Based Interaction with 3D Objects via Video Generation Tianyuan Zhang, Hong-Xing Yu, Rundi Wu, Brandon Y Feng, Changxi Zheng, Noah Snavely, Jiajun Wu, William T. Freeman
PDF
Physics-Free Spectrally Multiplexed Photometric Stereo Under Unknown Spectral Composition Satoshi Ikehata, Yuta Asano
PDF
Physics-Informed Knowledge Transfer for Underwater Monocular Depth Estimation Jinghe Yang, Mingming Gong, Ye Pu
PDF
Pick-a-Back: Selective Device-to-Device Knowledge Transfer in Federated Continual Learning HyungJune Lee, JinYi Yoon
PDF
PILoRA: Prototype Guided Incremental LoRA for Federated Class-Incremental Learning Haiyang Guo, Fei Zhu, Wenzhuo Liu, Xu-Yao Zhang, Cheng-Lin Liu
PDF
PISR: Polarimetric Neural Implicit Surface Reconstruction for Textureless and Specular Objects Guangcheng Chen, Yicheng He, Li He, Hong Zhang
PDF
PiTe: Pixel-Temporal Alignment for Large Video-Language Model Yang Liu, Pengxiang Ding, Siteng Huang, Min Zhang, Han Zhao, Donglin Wang
PDF
Pix2Gif: Motion-Guided Diffusion for GIF Generation Hitesh Kandala, Jianfeng Gao, Jianwei Yang
PDF
PixArt-Sigma: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation Junsong Chen, Chongjian Ge, Enze Xie, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, Zhenguo Li
PDF
Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization Tao Yang, Rongyuan Wu, Peiran Ren, Xuansong Xie, Lei Zhang
PDF
Pixel-GS: Density Control with Pixel-Aware Gradient for 3D Gaussian Splatting Zheng Zhang, Wenbo Hu, Yixing Lao, Tong He, Hengshuang Zhao
PDF
PixOOD: Pixel-Level Out-of-Distribution Detection Tomas Vojir, Jan Sochman, Jiri Matas
PDF
Placing Objects in Context via Inpainting for Out-of-Distribution Segmentation Pau de Jorge Aranda, Riccardo Volpi, Puneet Dokania, Philip Torr, Gregory Rogez
PDF
Plain-Det: A Plain Multi-Dataset Object Detector Cheng Shi, Yuchen Zhu, Sibei Yang
PDF
Plan, Posture and Go: Towards Open-Vocabulary Text-to-Motion Generation Jinpeng Liu, Wenxun Dai, Chunyu Wang, Yiji Cheng, Yansong Tang, Xin Tong
PDF
Platypus: A Generalized Specialist Model for Reading Text in Various Forms Peng Wang, Zhaohai Li, Jun Tang, Humen Zhong, Fei Huang, Zhibo Yang, Cong Yao
PDF
PLOT: Text-Based Person Search with Part Slot Attention for Corresponding Part Discovery Jicheol Park, Dongwon Kim, Boseung Jeong, Suha Kwak
PDF
Plug and Play: A Representation Enhanced Domain Adapter for Collaborative Perception Tianyou Luo, Quan Yuan, Yuchen Xia, Guiyang Luo, Yujia Yang, Jinglin Li
PDF
Plug-and-Play Learned Proximal Trajectory for 3D Sparse-View X-Ray Computed Tomography Romain Vo, Julie Escoda, Caroline Vienne, Etienne Decenciere
PDF
PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation Ning Gao, Sanping Zhou, Le Wang, Nanning Zheng
PDF
POA: Pre-Training Once for Models of All Sizes Yingying Zhang, Xin Guo, Jiangwei Lao, Lei Yu, Lixiang Ru, Jian Wang, Guo Ye, Huimei He, Jingdong Chen, Ming Yang
PDF
POCA: Post-Training Quantization with Temporal Alignment for Codec Avatars Jian Meng, Yuecheng Li, Leo Li, Syed Shakib Sarwar, Dilin Wang, Jae-sun Seo
PDF
POET: Prompt Offset Tuning for Continual Human Action Adaptation Prachi Garg, K J Joseph, Vineeth N Balasubramanian, Necati Cihan Camgoz, Chengde Wan, Kenrick Kin, Weiguang Si, Shugao Ma, Fernando de la Torre
PDF
Point-Supervised Panoptic Segmentation via Estimating Pseudo Labels from Learnable Distance Jing Li, Junsong Fan, Zhaoxiang Zhang
PDF
PointLLM: Empowering Large Language Models to Understand Point Clouds Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin
PDF
PointNeRF++: A Multi-Scale, Point-Based Neural Radiance Field Weiwei Sun, Eduard Trulls, Yang-Che Tseng, Sneha Sambandam, Gopal Sharma, Andrea Tagliasacchi, Kwang Moo Yi
PDF
PointRegGPT: Boosting 3D Point Cloud Registration Using Generative Point-Cloud Pairs for Training Suyi Chen, Hao Xu, Haipeng Li, Kunming Luo, Guanghui Liu, Chi-Wing Fu, Ping Tan, Shuaicheng Liu
PDF
PolyOculus: Simultaneous Multi-View Image-Based Novel View Synthesis Jason J. Yu, Tristan Aumentado-Armstrong, Fereshteh Forghani, Konstantinos G. Derpanis, Marcus A. Brubaker
PDF
PolyRoom: Room-Aware Transformer for Floorplan Reconstruction Yuzhou Liu, Lingjie Zhu, Xiaodong Ma, Hanqiao Ye, Xiang Gao, Xianwei Zheng, Shuhan Shen
PDF
Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos Keqiang Sun, Dor Litvak, Yunzhi Zhang, Hongsheng Li, Jiajun Wu, Shangzhe Wu
PDF
Portrait4D-V2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer Yu Deng, Duomin Wang, Baoyuan Wang
PDF
Pose Guided Fine-Grained Sign Language Video Generation Tongkai Shi, Lianyu Hu, Fanhua Shang, Jichao Feng, Liu Peidong, Wei Feng
PDF
Pose-Aware Self-Supervised Learning with Viewpoint Trajectory Regularization Jiayun Wang, Yubei Chen, Stella X. Yu
PDF
PoseAugment: Generative Human Pose Data Augmentation with Physical Plausibility for IMU-Based Motion Capture Zhuojun Li, Chun Yu, Chen Liang, Yuanchun Shi
PDF
PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control Yong Zhong, Min Zhao, Zebin You, Xiaofeng Yu, Changwang Zhang, Chongxuan Li
PDF
PoseEmbroider: Towards a 3D, Visual, Semantic-Aware Human Pose Representation Ginger Delmas, Philippe Weinzaepfel, Francesc Moreno-Noguer, Gregory Rogez
PDF
PoseSOR: Human Pose Can Guide Our Attention Huankang Guan, Rynson W.H. Lau
PDF
PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer Tongkun Guan, Chengyu Lin, Wei Shen, Xiaokang Yang
PDF
Post-Training Quantization with Progressive Calibration and Activation Relaxing for Text-to-Image Diffusion Models Siao Tang, Xin Wang, Hong Chen, Chaoyu Guan, Zewen Wu, Yansong Tang, Wenwu Zhu
PDF
PosterLlama: Bridging Design Ability of Language Model to Content-Aware Layout Generation Jaejung Seol, SeoJun Kim, Jaejun Yoo
PDF
Power Variable Projection for Initialization-Free Large-Scale Bundle Adjustment Simon Weber, Je Hyeong Hong, Daniel Cremers
PDF
Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning Fanyue Wei, Wei Zeng, Zhenyang Li, Dawei Yin, Lixin Duan, Wen Li
PDF
PPAD: Iterative Interactions of Prediction and Planning for End-to-End Autonomous Driving Zhili Chen, Maosheng Ye, Shuangjie Xu, Tongyi Cao, Qifeng Chen
PDF
PQ-SAM: Post-Training Quantization for Segment Anything Model Xiaoyu Liu, Xin Ding, Lei Yu, Yuanyuan Xi, Wei Li, Zhijun Tu, Jie Hu, Hanting Chen, Baoqun Yin, Zhiwei Xiong
PDF
Pre-Trained Visual Dynamics Representations for Efficient Policy Learning Hao Luo, Bohan Zhou, Zongqing Lu
PDF
PreciseControl: Enhancing Text-to-Image Diffusion Models with Fine-Grained Attribute Control Rishubh Parihar, Sachidanand VS, Sabariswaran Mani, Tejan Karmali, Venkatesh Babu Radhakrishnan
PDF
PredBench: Benchmarking Spatio-Temporal Prediction Across Diverse Disciplines ZiDong Wang, Zeyu Lu, Di Huang, Tong He, Xihui Liu, Wanli Ouyang, Lei Bai
PDF
Prediction Exposes Your Face: Black-Box Model Inversion via Prediction Alignment Yufan Liu, Wanqian Zhang, Dayan Wu, Zheng Lin, Jingzi Gu, Weiping Wang
PDF
PreLAR: World Model Pre-Training with Learnable Action Representation Lixuan Zhang, Meina Kan, Shiguang Shan, Xilin Chen
PDF
PreSight: Enhancing Autonomous Vehicle Perception with City-Scale NeRF Priors Tianyuan Yuan, Yucheng Mao, Jiawei Yang, Yicheng Liu, Yue Wang, Hang Zhao
PDF
PRET: Planning with Directed Fidelity Trajectory for Vision and Language Navigation Renjie Lu, Jingke Meng, Wei-Shi Zheng
PDF
Preventing Catastrophic Forgetting Through Memory Networks in Continuous Detection Gaurav Bhatt, Leonid Sigal, James Ross
PDF
Preventing Catastrophic Overfitting in Fast Adversarial Training: A Bi-Level Optimization Perspective Zhaoxin Wang, Handing Wang, Cong Tian, Yaochu Jin
PDF
Prioritized Semantic Learning for Zero-Shot Instance Navigation Xinyu Sun, Lizhao Liu, Hongyan Zhi, Ronghe Qiu, Junwei Liang
PDF
Privacy-Preserving Adaptive Re-Identification Without Image Transfer Hamza Rami, Jhony H. Giraldo, Nicolas Winckler, Stéphane Lathuilière
PDF
Pro2SAM: Mask Prompt to SAM with Grid Points for Weakly Supervised Object Localization Xi Yang, Songsong Duan, Nannan Wang, Xinbo Gao
PDF
Probabilistic Image-Driven Traffic Modeling via Remote Sensing Scott Workman, Armin Hadzic
PDF
Probabilistic Weather Forecasting with Deterministic Guidance-Based Diffusion Model Donggeun Yoon, Minseok Seo, Doyi Kim, Yeji Choi, Donghyeon Cho
PDF
ProCreate, Don't Reproduce! Propulsive Energy Diffusion for Creative Generation Jack Lu, Ryan Teehan, Mengye Ren
PDF
ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion Sungmin Woo, Wonjoon Lee, Woo Jin Kim, Dogyoon Lee, Sangyoun Lee
PDF
Progressive Classifier and Feature Extractor Adaptation for Unsupervised Domain Adaptation on Point Clouds Zicheng Wang, Zhen Zhao, Yiming Wu, Luping Zhou, Dong Xu
PDF
Progressive Pretext Task Learning for Human Trajectory Prediction Xiaotong Lin, Tianming Liang, Jianhuang Lai, Jian-Fang Hu
PDF
Progressive Proxy Anchor Propagation for Unsupervised Semantic Segmentation Hyun Seok Seong, WonJun Moon, SuBeen Lee, Jae-Pil Heo
PDF
Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation Zeyang Zhao, Qilong Xue, Yifan Bai, Yuhang He, Xing Wei, Yihong Gong
PDF
ProMerge: Prompt and Merge for Unsupervised Instance Segmentation Dylan J Li, Gyungin Shin
PDF
Prompt-Based Test-Time Real Image Dehazing: A Novel Pipeline Zixuan Chen, Zewei He, Ziqian Lu, Xuecheng Sun, Zheming Lu
PDF
Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks Hunmin Yang, Jongoh Jeong, Kuk-Jin Yoon
PDF
PromptCCD: Learning Gaussian Mixture Prompt Pool for Continual Category Discovery Fernando Julio Cendra, Bingchen Zhao, Kai Han
PDF
PromptFusion: Decoupling Stability and Plasticity for Continual Learning Haoran Chen, Zuxuan Wu, Xintong Han, Menglin Jia, Yu-Gang Jiang
PDF
Prompting Future Driven Diffusion Model for Hand Motion Prediction Bowen Tang, Kaihao Zhang, Wenhan Luo, Wei Liu, Hongdong Li
PDF
Prompting Language-Informed Distribution for Compositional Zero-Shot Learning Wentao Bao, Lichang Chen, Heng Huang, Yu Kong
PDF
PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts Zewen Chen, Haina Qin, Juan Wang, Chunfeng Yuan, Bing Li, Weiming Hu, Leon Wang
PDF
Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos Md Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang, Fu-Jen Chu, Kris Kitani, Gedas Bertasius, Xitong Yang
PDF
ProSub: Probabilistic Open-Set Semi-Supervised Learning with Subspace-Based Out-of-Distribution Detection Erik Wallin, Lennart Svensson, Fredrik Kahl, Lars Hammarstrand
PDF
Protecting NeRFs' Copyright via Plug-and-Play Watermarking Base Model Qi Song, Ziyuan Luo, Ka Chun Cheung, Simon See, Renjie Wan
PDF
ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models Against Stochastic Perturbation Yi Zhang, Yun Tang, Wenjie Ruan, Xiaowei Huang, Siddartha Khastgir, Paul A Jennings, Xingyu Zhao
PDF
ProtoComp: Diverse Point Cloud Completion with Controllable Prototype Xumin Yu, Yanbo Wang, Jie Zhou, Jiwen Lu
PDF
ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation Mengcheng Lan, Chaofeng Chen, Yiping Ke, Xinjiang Wang, Litong Feng, Wayne Zhang
PDF
PSALM: Pixelwise Segmentation with Large Multi-Modal Model Zheng Zhang, Yeyao Ma, Enming Zhang, Xiang Bai
PDF
Pseudo-Embedding for Generalized Few-Shot Point Cloud Segmentation Chih-Jung Tsai, Hwann-Tzong Chen, Tyng-Luh Liu
PDF
Pseudo-Keypoint RKHS Learning for Self-Supervised 6DoF Pose Estimation Yangzheng Wu, Michael Alan Greenspan
PDF
Pseudo-Labelling Should Be Aware of Disguising Channel Activations Changrui Chen, Kurt Debattista, Jungong Han
PDF
Pseudo-RIS: Distinctive Pseudo-Supervision Generation for Referring Image Segmentation Seonghoon Yu, Paul Hongsuck Seo, Jeany Son
PDF
Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos Mi Luo, Zihui Xue, Alex Dimakis, Kristen Grauman
PDF
PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation Yizhe Xiong, Hui Chen, Tianxiang Hao, Zijia Lin, Jungong Han, Yuesong Zhang, Guoxin Wang, Yongjun Bao, Guiguang Ding
PDF
Pyramid Diffusion for Fine 3D Large Scene Generation Yuheng Liu, Xinke Li, Xueting Li, Lu Qi, Chongshou Li, Ming-Hsuan Yang
PDF
Q&A Prompts: Discovering Rich Visual Clues Through Mining Question-Answer Prompts for VQA Requiring Diverse World Knowledge Haibo Wang, Weifeng Ge
PDF
Quality Assured: Rethinking Annotation Strategies in Imaging AI Tim Rädsch, Annika Reinke, Vivienn Weru, Minu D. Tizabi, Nicholas Heller, Fabian Isensee, Annette Kopp-Schneider, Lena Maier-Hein
PDF
Quanta Video Restoration Prateek Chennuri, Yiheng Chi, Enze Jiang, GM Dilshan Godaliyadda, Abhiram Gnanasambandam, Hamid R Sheikh, Istvan Gyongy, Stanley H Chan
PDF
Quantization-Friendly Winograd Transformations for Convolutional Neural Networks Vladimir Protsenko, Vladimir Kryzhanovskiy, Alexander Filippov
PDF
Quantized Prompt for Efficient Generalization of Vision-Language Models Tianxiang Hao, Xiaohan Ding, Juexiao Feng, Yuhong Yang, Hui Chen, Guiguang Ding
PDF
QUAR-VLA: Vision-Language-Action Model for Quadruped Robots Pengxiang Ding, Han Zhao, Wenjie Zhang, Wenxuan Song, Min Zhang, Siteng Huang, Ningxi Yang, Donglin Wang
PDF
QueryCDR: Query-Based Controllable Distortion Rectification Network for Fisheye Images Pengbo Guo, Chengxu Liu, Xingsong Hou, Xueming Qian
PDF
R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model Changhoon Kim, Kyle Min, Yezhou Yang
PDF
R^2-Bench: Benchmarking the Robustness of Referring Perception Models Under Perturbations Xiang Li, Kai Qiu, Jinglu Wang, Xiaohao Xu, Kashu Yamazaki, Hao Chen, Rita Singh, Xiaonan Huang, Bhiksha Raj
PDF
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding Ye Liu, Jixuan He, Wanhua Li, Junsik Kim, Donglai Wei, Hanspeter Pfister, Chang Wen Chen
PDF
R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection Zheyuan Zhou, Le Wang, Naiyu Fang, Zili Wang, Lemiao Qiu, Shuyou Zhang
PDF
R3DS: Reality-Linked 3D Scenes for Panoramic Scene Understanding Qirui Wu, Sonia Raychaudhuri, Daniel Ritchie, Manolis Savva, Angel X Chang
PDF
RadEdit: Stress-Testing Biomedical Vision Models via Diffusion Image Editing Fernando Pérez-García, Sam Bond-Taylor, Pedro Sanchez, Boris van Breugel, Daniel Coelho de Castro, Harshita Sharma, Valentina Salvatelli, Maria Teodora A Wetscherek, Hannah CM Richardson, Matthew Lungren, Aditya Nori, Javier Alvarez-Valle, Ozan Oktay, Maximilian Ilse
PDF
Radiative Gaussian Splatting for Efficient X-Ray Novel View Synthesis Yuanhao Cai, Yixun Liang, Jiahao Wang, Angtian Wang, Yulun Zhang, Xiaokang Yang, Zongwei Zhou, Alan Yuille
PDF
RaFE: Generative Radiance Fields Restoration Zhongkai Wu, Ziyu Wan, Jing Zhang, Jing Liao, Dong Xu
PDF
Raindrop Clarity: A Dual-Focused Dataset for Day and Night Raindrop Removal Yeying Jin, Xin Li, Jiadong Wang, Yan Zhan, Malu Zhang
PDF
Raising the Ceiling: Conflict-Free Local Feature Matching with Dynamic View Switching Xiaoyong Lu, Songlin Du
PDF
Random Walk on Pixel Manifolds for Anomaly Segmentation of Complex Driving Scenes Zelong Zeng, Kaname Tomite
PDF
RangeLDM: Fast Realistic LiDAR Point Cloud Generation Qianjiang Hu, Zhimin Zhang, Wei Hu
PDF
RANRAC: Robust Neural Scene Representations via Random Ray Consensus Benno Buschmann, Andreea Dogaru, Elmar Eisemann, Michael Weinmann, Bernhard Egger
PDF
RAP: Retrieval-Augmented Planner for Adaptive Procedure Planning in Instructional Videos Ali Zare, Yulei Niu, Hammad Ayyubi, Shih-Fu Chang
PDF
RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation Li Li, Hubert P. H. Shum, Toby P Breckon
PDF
Rasterized Edge Gradients: Handling Discontinuities Differentially Stanislav Pidhorskyi, Tomas Simon, Gabriel Schwartz, He Wen, Yaser Sheikh, Jason Saragih
PDF
Rate-Distortion-Cognition Controllable Versatile Neural Image Compression Jinming Liu, Ruoyu Feng, Yunpeng Qi, Qiuyu Chen, Zhibo Chen, Wenjun Zeng, Xin Jin
PDF
RAVE: Residual Vector Embedding for CLIP-Guided Backlit Image Enhancement Tatiana Gaintseva, Martin Benning, Gregory Slabaugh
PDF
RAW-Adapter: Adapting Pretrained Visual Model to Camera RAW Images Ziteng Cui, Tatsuya Harada
PDF
Rawformer: Unpaired Raw-to-Raw Translation for Learnable Camera ISPs Georgy Perevozchikov, Nancy Mehta, Mahmoud Afifi, Radu Timofte
PDF
Ray Denoising: Depth-Aware Hard Negative Sampling for Multi-View 3D Object Detection Feng Liu, Tengteng Huang, Qianjing Zhang, Haotian Yao, Chi Zhang, Fang Wan, Qixiang Ye, Yanzhao Zhou
PDF
Ray-Distance Volume Rendering for Neural Scene Reconstruction Ruihong Yin, Yunlu Chen, Sezer Karaoglu, Theo Gevers
PDF
RCS-Prompt: Learning Prompt to Rearrange Class Space for Prompt-Based Continual Learning Longrong Yang, Hanbin Zhao, Yunlong Yu, Xiaodong Zeng, Xi Li
PDF
Real Appearance Modeling for More General Deepfake Detection Jiahe Tian, Cai Yu, Xi Wang, Peng Chen, Zihao Xiao, Jiao Dai, Yesheng Chai, Jizhong Han
PDF
Real-Data-Driven 2000 FPS Color Video from Mosaicked Chromatic Spikes Siqi Yang, Zhaojun Huang, Yakun Chang, Bin Fan, Zhaofei Yu, Boxin Shi
PDF
Real-Time 3D-Aware Portrait Editing from a Single Image Qingyan Bai, Zifan Shi, Yinghao Xu, Hao Ouyang, Qiuyu Wang, Ceyuan Yang, Xuan Wang, Gordon Wetzstein, Yujun Shen, Qifeng Chen
PDF
Real-Time Holistic Robot Pose Estimation with Unknown States Shikun Ban, Juling Fan, Xiaoxuan Ma, Wentao Zhu, Yu Qiao, Yizhou Wang
PDF
ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments Taewoong Kim, Cheolhong Min, Byeonghwi Kim, Jinyeon Kim, Wonje Jeung, Jonghyun Choi
PDF
RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios Wenhao Ding, Yulong Cao, Ding Zhao, Chaowei Xiao, Marco Pavone
PDF
Realistic Human Motion Generation with Cross-Diffusion Models Zeping Ren, Shaoli Huang, Xiu Li
PDF
RealViformer: Investigating Attention for Real-World Video Super-Resolution Yuehan Zhang, Angela Yao
PDF
Reason2Drive: Towards Interpretable and Chain-Based Reasoning for Autonomous Driving Ming Nie, Renyuan Peng, Chunwei Wang, Xinyue Cai, Jianhua Han, Hang Xu, Li Zhang
PDF
Rebalancing Using Estimated Class Distribution for Imbalanced Semi-Supervised Learning Under Class Distribution Mismatch Taemin Park, Hyuck Lee, Heeyoung Kim
PDF
Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers Chi-Pin Huang, Kai-Po Chang, Chung-Ting Tsai, Yung-Hsuan Lai, Fu-En Yang, Yu-Chiang Frank Wang
PDF
ReCON: Training-Free Acceleration for Text-to-Image Synthesis with Retrieval of Concept Prompt Trajectories Chen-Yi Lu, Shubham Agarwal, Md Mehrab Tanjim, Kanak Mahadik, Anup Rao, Subrata Mitra, Shiv K Saini, Saurabh Bagchi, Somali Chaterji
PDF
Reconstruction and Simulation of Elastic Objects with Spring-Mass 3D Gaussians Licheng Zhong, Hong-Xing Yu, Jiajun Wu, Yunzhu Li
PDF
Rectify the Regression Bias in Long-Tailed Object Detection Ke Zhu, Minghao Fu, Jie Shao, Tianyu Liu, Jianxin Wu
PDF
RecurrentBEV: A Long-Term Temporal Fusion Framework for Multi-View 3D Detection Ming Chang, Xishan Zhang, Rui Zhang, Zhipeng Zhao, Guanhua He, Shaoli Liu
PDF
Recursive Visual Programming Jiaxin Ge, Sanjay Subramanian, Baifeng Shi, Roei Herzig, Trevor Darrell
PDF
REDIR: Refocus-Free Event-Based De-Occlusion Image Reconstruction Qi Guo, Hailong Shi, Huan Li, Jinsheng Xiao, Xingyu Gao
PDF
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes Yaoting Wang, Peiwen Sun, Dongzhan Zhou, Guangyao Li, Honggang Zhang, Di Hu
PDF
Referring Atomic Video Action Recognition Kunyu Peng, Jia Fu, Kailun Yang, Di Wen, Yufan Chen, Ruiping Liu, Junwei Zheng, Jiaming Zhang, Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg
PDF
Refine, Discriminate and Align: Stealing Encoders via Sample-Wise Prototypes and Multi-Relational Extraction Shuchi Wu, Chuan Ma, Kang Wei, Xiaogang Xu, Ming Ding, Yuwen Qian, Di Xiao, Tao Xiang
PDF
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models Jinrui Zhang, Teng Wang, Haigang Zhang, Ping Lu, Feng Zheng
PDF
REFRAME: Reflective Surface Real-Time Rendering for Mobile Devices Chaojie Ji, Yufeng Li, Yiyi Liao
PDF
Region-Adaptive Transform with Segmentation Prior for Image Compression Yuxi Liu, Wenhan Yang, Huihui Bai, Yunchao Wei, Yao Zhao
PDF
Region-Aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning Meixuan Li, Tianyu Li, Guoqing Wang, Peng Wang, Yang Yang, Jie Zou
PDF
Region-Aware Sequence-to-Sequence Learning for Hyperspectral Denoising JiaHua Xiao, Yang Liu, Xing Wei
PDF
Region-Centric Image-Language Pretraining for Open-Vocabulary Detection Dahun Kim, Anelia Angelova, Weicheng Kuo
PDF
Region-Native Visual Tokenization Mengyu Wang, Yuyao Huang, Henghui Ding, Xinlong Wang, Tiejun Huang, Yao Zhao, Yunchao Wei, Shuicheng Yan
PDF
RegionDrag: Fast Region-Based Image Editing with Diffusion Models Jingyi Lu, Xinghui Li, Kai Han
PDF
ReGround: Improving Textual and Spatial Grounding at No Cost Phillip Y. Lee, Minhyuk Sung
PDF
Regularizing Dynamic Radiance Fields with Kinematic Fields Woobin Im, Geonho Cha, Sebin Lee, Jumin Lee, Juhyeong Seon, Dongyoon Wee, Sungeui Yoon
PDF
Regulating Model Reliance on Non-Robust Features by Smoothing Input Marginal Density Peiyu Yang, Naveed Akhtar, Mubarak Shah, Ajmal Mian
PDF
Reinforcement Learning Friendly Vision-Language Model for Minecraft Haobin Jiang, Junpeng Yue, Hao Luo, Ziluo Ding, Zongqing Lu
PDF
Reinforcement Learning Meets Visual Odometry Nico Messikommer, Giovanni Cioffi, Mathias Gehrig, Davide Scaramuzza
PDF
Reinforcement Learning via Auxiliary Task Distillation Abhinav N Harish, Larry Heck, Josiah P Hanna, Zsolt Kira, Andrew Szot
PDF
Rejection Sampling IMLE: Designing Priors for Better Few-Shot Image Synthesis Chirag Vashist, Shichong Peng, Ke Li
PDF
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan
PDF
Reliability in Semantic Segmentation: Can We Use Synthetic Data? Thibaut Loiseau, Tuan-Hung Vu, Mickael Chen, Patrick Pérez, Matthieu Cord
PDF
Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models Chao Gong, Kai Chen, Zhipeng Wei, Jingjing Chen, Yu-Gang Jiang
PDF
Reliable Spatial-Temporal Voxels for Multi-Modal Test-Time Adaptation Haozhi Cao, Yuecong Xu, Jianfei Yang, Pengyu Yin, Xingyu Ji, Shenghai Yuan, Lihua Xie
PDF
Relightable 3D Gaussians: Realistic Point Cloud Relighting with BRDF Decomposition and Ray Tracing Jian Gao, Chun Gu, Youtian Lin, Zhihao Li, Hao Zhu, Xun Cao, Li Zhang, Yao Yao
PDF
Relightable Neural Actor with Intrinsic Decomposition and Pose Control Diogo Carbonera Luvizon, Vladislav Golyanik, Adam Kortylewski, Marc Habermann, Christian Theobalt
PDF
ReLoo: Reconstructing Humans Dressed in Loose Garments from Monocular Video in the Wild Chen Guo, Tianjian Jiang, Manuel Kaufmann, Chengwei Zheng, Julien Valentin, Jie Song, Otmar Hilliges
PDF
ReMamber: Referring Image Segmentation with Mamba Twister Yuhuan Yang, Chaofan Ma, Jiangchao Yao, Zhun Zhong, Ya Zhang, Yanfeng Wang
PDF
ReMatching: Low-Resolution Representations for Scalable Shape Correspondence Filippo Maggioli, Daniele Baieri, Emanuele Rodola, Simone Melzi
PDF
ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions Anindita Ghosh, Rishabh Dabral, Vladislav Golyanik, Christian Theobalt, Philipp Slusallek
PDF
Remove Projective LiDAR Depthmap Artifacts via Exploiting Epipolar Geometry Shengjie Zhu, Girish Chandar Ganesan, Abhinav Kumar, Xiaoming Liu
PDF
Removing Distributional Discrepancies in Captions Improves Image-Text Alignment Mu Cai, Haotian Liu, Yuheng Li, Yijun Li, Eli Shechtman, Zhe Lin, Yong Jae Lee, Krishna Kumar Singh
PDF
Removing Rows and Columns of Tokens in Vision Transformer Enables Faster Dense Prediction Without Retraining Diwei Su, Cheng Fei, Jianxu Luo
PDF
ReNoise: Real Image Inversion Through Iterative Noising Daniel Garibi, Or Patashnik, Andrey Voynov, Hadar Averbuch-Elor, Danny Cohen-Or
PDF
Repaint123: Fast and High-Quality One Image to 3D Generation with Progressive Controllable Repainting Junwu Zhang, Zhenyu Tang, Yatian Pang, Xinhua Cheng, Peng Jin, Yida Wei, Xing Zhou, Munan Ning, Li Yuan
PDF
RePOSE: 3D Human Pose Estimation via Spatio-Temporal Depth Relational Consistency Ziming Sun, Yuan Liang, Zejun Ma, Tianle Zhang, Linchao Bao, Guiqing Li, Shengfeng He
PDF
Representation Enhancement-Stabilization: Reducing Bias-Variance of Domain Generalization Wei Huang, Yilei Shi, Zhitong Xiong, Xiao Xiang Zhu
PDF
Representing Topological Self-Similarity Using Fractal Feature Maps for Accurate Segmentation of Tubular Structures Jiaxing Huang, Yanfeng Zhou, Yaoru Luo, Guole Liu, Heng Guo, Ge Yang
PDF
Reprojection Errors as Prompts for Efficient Scene Coordinate Regression Ting-Ru Liu, Hsuan-Kung Yang, Jou-Min Liu, Chun-Wei Huang, Tsung-Chih Chiang, Quan Kong, Norimasa Kobori, Chun-Yi Lee
PDF
RepVF: A Unified Vector Fields Representation for Multi-Task 3D Perception Jianbing Shen, Chunliang Li, Wencheng Han, Junbo Yin, Sanyuan Zhao
PDF
Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation Zhilin Zhu, Xiaopeng Hong, Zhiheng Ma, Weijun Zhuang, YaoHui Ma, Yong Dai, Yaowei Wang
PDF
Resilience of Entropy Model in Distributed Neural Networks Milin Zhang, Mohammad Abdi, Shahriar Rifat, Francesco Restuccia
PDF
Resolving Scale Ambiguity in Multi-View 3D Reconstruction Using Dual-Pixel Sensors Kohei Ashida, Hiroaki Santo, Fumio Okura, Yasuyuki Matsushita
PDF
Responsible Visual Editing Minheng Ni, Yeli Shen, Lei Zhang, Wangmeng Zuo
PDF
Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration Chujie Qin, Ruiqi Wu, Zikun Liu, Xin Lin, Chun-Le Guo, Hyun Hee Park, Chongyi Li
PDF
Restoring Images in Adverse Weather Conditions via Histogram Transformer Shangquan Sun, Wenqi Ren, Xinwei Gao, Rui Wang, Xiaochun Cao
PDF
ReSyncer: Rewiring Style-Based Generator for Unified Audio-Visually Synced Facial Performer Jiazhi Guan, Zhiliang Xu, Hang Zhou, Kaisiyuan Wang, Shengyi He, Zhanwang Zhang, Borong Liang, Haocheng Feng, Errui Ding, Jingtuo Liu, Jingdong Wang, Youjian Zhao, Ziwei Liu
PDF
Retargeting Visual Data with Deformation Fields Tim Elsner, Julia Berger, Tong Wu, Victor Czech, Lin Gao, Leif Kobbelt
PDF
Rethinking and Improving Visual Prompt Selection for In-Context Learning Segmentation Framework Wei Suo, Lanqing Lai, Mengyang Sun, Hanwang Zhang, Peng Wang, Yanning Zhang
PDF
Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather Junsung Park, Kyungmin Kim, Hyunjung Shim
PDF
Rethinking Data Bias: Dataset Copyright Protection via Embedding Class-Wise Hidden Bias Jinhyeok Jang, ByungOk Han, Jaehong Kim, Chan-Hyun Youn
PDF
Rethinking Deep Unrolled Model for Accelerated MRI Reconstruction Bingyu Xin, Meng Ye, Leon Axel, Dimitris N. Metaxas
PDF
Rethinking Directional Parameterization in Neural Implicit Surface Reconstruction Zijie Jiang, Tianhan Xu, Hiroharu Kato
PDF
Rethinking Fast Adversarial Training: A Splitting Technique to Overcome Catastrophic Overfitting Masoumeh Zareapoor, Pourya Shamsolmoali
PDF
Rethinking Features-Fused-Pyramid-Neck for Object Detection Hulin Li
PDF
Rethinking Few-Shot Class-Incremental Learning: Learning from Yourself Yu-Ming Tang, Yi-Xing Peng, Jingke Meng, Wei-Shi Zheng
PDF
Rethinking Image Super Resolution from Training Data Perspectives Go Ohtani, Ryu Tadokoro, Ryosuke Yamada, Yuki M Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Yoshimitsu Aoki
PDF
Rethinking Image-to-Video Adaptation: An Object-Centric Perspective Rui Qian, Shuangrui Ding, Dahua Lin
PDF
Rethinking LiDAR Domain Generalization: Single Source as Multiple Density Domains Jaeyeul Kim, Jungwan Woo, Jeonghoon Kim, Sunghoon Im
PDF
Rethinking Normalization Layers for Domain Generalizable Person Re-Identification Ren Nie, Jin Ding, Xue Zhou, Xi Li
PDF
Rethinking Tree-Ring Watermarking for Enhanced Multi-Key Identification Hai Ci, Pei Yang, Yiren Song, Mike Zheng Shou
PDF
Rethinking Unsupervised Outlier Detection via Multiple Thresholding Zhonghang Liu, Panzhong Lu, Guoyang Xie, Zhichao Lu, Wen-Yan Lin
PDF
Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model Chen Rao, Guangyuan Li, Zehua Lan, Jiakai Sun, Junsheng Luan, Wei Xing, Lei Zhao, Huaizhong Lin, Jianfeng Dong, Dalong Zhang
PDF
Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data Wufei Ma, Kai Li, Zhongshi Jiang, Moustafa Meshry, Qihao Liu, Huiyu Wang, Christian Haene, Alan Yuille
PDF
Rethinking Weakly-Supervised Video Temporal Grounding from a Game Perspective Xiang Fang, Zeyu Xiong, Wanlong Fang, Xiaoye Qu, Chen Chen, Jianfeng Dong, Keke Tang, Pan Zhou, Yu Cheng, Daizong Liu
PDF
Retrieval Robust to Object Motion Blur Rong Zou, Marc Pollefeys, Denys Rozumnyi
PDF
Revising Densification in Gaussian Splatting Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder
PDF
REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models Agneet Chatterjee, Yiran Luo, Tejas Gokhale, Yezhou Yang, Chitta R Baral
PDF
Revisit Anything: Visual Place Recognition via Image Segment Retrieval Kartik Garg, Sai Shubodh, Shishir N Y Kolathaya, Madhava Krishna, Sourav Garg
PDF
Revisit Event Generation Model: Self-Supervised Learning of Event-to-Video Reconstruction with Implicit Neural Representations Zipeng Wang, Yunfan Lu, Lin Wang
PDF
Revisit Human-Scene Interaction via Space Occupancy Xinpeng Liu, Haowen Hou, Yanchao Yang, Yong-Lu Li, Cewu Lu
PDF
Revisit Self-Supervision with Local Structure-from-Motion Shengjie Zhu, Xiaoming Liu
PDF
Revisiting Adaptive Cellular Recognition Under Domain Shifts: A Contextual Correspondence View Jianan Fan, Dongnan Liu, Canran Li, Hang Chang, Heng Huang, Filip Braet, Mei Chen, Weidong Cai
PDF
Revisiting Calibration of Wide-Angle Radially Symmetric Cameras Andrea Porfiri Dal Cin, Francesco Azzoni, Giacomo Boracchi, Luca Magri
PDF
Revisiting Domain-Adaptive Object Detection in Adverse Weather by the Generation and Composition of High-Quality Pseudo-Labels Rui Zhao, Huibin Yan, Shuoyao Wang
PDF
Revisiting Feature Disentanglement Strategy in Diffusion Training and Breaking Conditional Independence Assumption in Sampling Wonwoong Cho, Hareesh Ravi, Midhun Harikumar, Vinh Khuc, Krishna Kumar Singh, Jingwan Lu, David Iseri Inouye, Ajinkya Kale
PDF
Revisiting Supervision for Continual Representation Learning Daniel Marczak, Sebastian Cygert, Tomasz Trzcinski, Bartlomiej Twardowski
PDF
RGBD GS-ICP SLAM Seongbo Ha, Jiung Yeon, Hyeonwoo Yu
PDF
RGNet: A Unified CLIP Retrieval and Grounding Network for Long Videos Tanveer Hannan, Md Mohaiminul Islam, Thomas Seidl, Gedas Bertasius
PDF
RICA^2: Rubric-Informed, Calibrated Assessment of Actions Abrar Majeedi, Viswanatha Reddy Gajjala, Satya Sai Srinath Namburi Gnvv, Yin Li
PDF
RING-NeRF: Rethinking Inductive Biases for Versatile and Efficient Neural Fields Doriand Petit, Steve Bourgeois, Dumitru Pavel, Vincent Gay-Bellile, Florian Chabot, Loïc Barthe
PDF
Risk-Aware Self-Consistent Imitation Learning for Trajectory Planning in Autonomous Driving Yixuan Fan, Ya-Li Li, Shengjin Wang
PDF
RISurConv: Rotation Invariant Surface Attention-Augmented Convolutions for 3D Point Cloud Classification and Segmentation Zhiyuan Zhang, Licheng Yang, Zhiyu Xiang
PDF
RoadPainter: Points Are Ideal Navigators for Topology transformER Zhongxing Ma, Liang Shuang, Yongkun Wen, Weixin Lu, Guowei Wan
PDF
Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation Yuanchen Ju, Kaizhe Hu, Guowei Zhang, Gu Zhang, Mingrun Jiang, Huazhe Xu
PDF
Robust Calibration of Large Vision-Language Adapters Balamurali Murugesan, Julio Silva-Rodríguez, Ismail Ben Ayed, Jose Dolz
PDF
Robust Fitting on a Gate Quantum Computer Frances F Yang, Michele Sasdelli, Tat-Jun Chin
PDF
Robust Incremental Structure-from-Motion with Hybrid Features Shaohui Liu, Yidan Gao, Tianyi Zhang, Rémi Pautrat, Johannes L Schönberger, Viktor Larsson, Marc Pollefeys
PDF
Robust Multimodal Learning via Representation Decoupling Shicai Wei, Yang Luo, Yuji Wang, Chunbo Luo
PDF
Robust Nearest Neighbors for Source-Free Domain Adaptation Under Class Distribution Shift Antonio Tejero-de-Pablos, Riku Togashi, Mayu Otani, Shin'ichi Satoh
PDF
Robust Zero-Shot Crowd Counting and Localization with Adaptive Resolution SAM Jia Wan, Qiangqiang Wu, Wei Lin, Antoni Chan
PDF
Robust-Wide: Robust Watermarking Against Instruction-Driven Image Editing Runyi Hu, Jie Zhang, Ting Xu, Jiwei Li, Tianwei Zhang
PDF
Robustness Preserving Fine-Tuning Using Neuron Importance Guangrui Li, Rahul Duggal, Aaditya Singh, Kaustav Kundu, Bing Shuai, Jonathan Wu
PDF
Robustness Tokens: Towards Adversarial Robustness of Transformers Brian Pulfer, Yury Belousov, Slava Voloshynovskiy
PDF
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models Bowen Zhang, Yiji Cheng, Chunyu Wang, Ting Zhang, Jiaolong Yang, Yansong Tang, Feng Zhao, Dong Chen, Baining Guo
PDF
RoDUS: Robust Decomposition of Static and Dynamic Elements in Urban Scenes Thang-Anh-Quan Nguyen, Luis G Roldao Jimenez, Nathan Piasco, Moussab Bennehar, Dzmitry Tsishkou
PDF
RoGUENeRF: A Robust Geometry-Consistent Universal Enhancer for NeRF Sibi Catley-Chandar, Richard Shaw, Gregory Slabaugh, Eduardo Pérez Pellitero
PDF
RoofDiffusion: Constructing Roofs from Severely Corrupted Point Data via Diffusion Kyle Shih-Huang Lo, Jorg Peters, Eric Spellman
PDF
RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting Qi Wang, Ruijie Lu, Xudong Xu, Jingbo Wang, Michael Yu Wang, Bo Dai, Gang Zeng, Dan Xu
PDF
RoScenes: A Large-Scale Multi-View 3D Dataset for Roadside Perception Xiaosu Zhu, Hualian Sheng, Sijia Cai, Bing Deng, Shaopeng Yang, Qiao Liang, Ken Chen, Lianli Gao, Jingkuan Song, Jieping Ye
PDF
Rotary Position Embedding for Vision Transformer Byeongho Heo, Song Park, Dongyoon Han, Sangdoo Yun
PDF
Rotated Orthographic Projection for Self-Supervised 3D Human Pose Estimation Yao Yao, Yixuan Pan, Wenjun Shi, Dongchen Zhu, Lei Wang, Jiamao Li
PDF
RPBG: Towards Robust Neural Point-Based Graphics in the Wild Qingtian Zhu, Zizhuang Wei, Zhongtian Zheng, Yifan Zhan, Zhuyu Yao, Jiawang Zhang, Kejian Wu, Yinqiang Zheng
PDF
RS-NeRF: Neural Radiance Fields from Rolling Shutter Images Muyao Niu, Tong Chen, Yifan Zhan, Zhuoxiao Li, Xiang Ji, Yinqiang Zheng
PDF
RSL-BA: Rolling Shutter Line Bundle Adjustment Yongcong Zhang, Bangyan Liao, Yifei Xue, Lu Chen, Peidong Liu, Yizhen Lao
PDF
RT-Pose: A 4D Radar-Tensor Based 3D Human Pose Estimation and Localization Benchmark Yuan-Hao Ho, Jen-Hao Cheng, Sheng Yao Kuan, Zhongyu Jiang, Wenhao Chai, Hsiang-Wei Huang, Chih-Lung Lin, Jenq-Neng Hwang
PDF
S-JEPA: A Joint Embedding Predictive Architecture for Skeletal Action Recognition Mohamed Abdelfattah, Alexandre Alahi
PDF
S^3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis Dongze Li, Kang Zhao, Wei Wang, Yifeng Ma, Bo Peng, Yingya Zhang, Jing Dong
PDF
SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders Sheng-Wei Li, Zi-Xiang Wei, Wei-Jie Chen, Yi-Hsin Yu, Chih-Yuan Yang, Jane Yung-jen Hsu
PDF
SAFARI: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation Sayan Nag, Koustava Goswami, Srikrishna Karanam
PDF
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models Samuele Poppi, Tobia Poppi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
PDF
Safe-Sim: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries Wei-Jer Chang, Francesco Pittaluga, Masayoshi Tomizuka, Wei Zhan, Manmohan Chandraker
PDF
Safeguard Text-to-Image Diffusion Models with Human Feedback Inversion Sanghyun Kim, Seohyeon Jung, Balhae Kim, Moonseok Choi, Jinwoo Shin, Juho Lee
PDF
SAFNet: Selective Alignment Fusion Network for Efficient HDR Imaging Lingtong Kong, Bo Li, Yike Xiong, Hao Zhang, Hong Gu, Jinwei Chen
PDF
SAFT: Towards Out-of-Distribution Generalization in Fine-Tuning Bac Nguyen, Stefan Uhlich, Fabien Cardinaux, Lukas Mauch, Marzieh Edraki, Aaron Courville
PDF
SAGS: Structure-Aware 3D Gaussian Splatting Evangelos Ververas, Rolandos Alexandros Potamias, Jifei Song, Jiankang Deng, Stefanos Zafeiriou
PDF
SAH-SCI: Self-Supervised Adapter for Efficient Hyperspectral Snapshot Compressive Imaging Haijin Zeng, Yuxi Liu, Yongyong Chen, Youfa Liu, Chong Peng, Jingyong Su
PDF
SAIR: Learning Semantic-Aware Implicit Representation Canyu Zhang, Xiaoguang Li, Qing Guo, Song Wang
PDF
Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-Training Hyesong Choi, Hyejin Park, Kwang Moo Yi, Sungmin Cha, Dongbo Min
PDF
SAM-COD: SAM-Guided Unified Framework for Weakly-Supervised Camouflaged Object Detection Huafeng Chen, Pengxu Wei, Guangqian Guo, Shan Gao
PDF
SAM-Guided Graph Cut for 3D Instance Segmentation Haoyu Guo, He Zhu, Sida Peng, Yuang Wang, Yujun Shen, Ruizhen Hu, Xiaowei Zhou
PDF
SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation Yi-Chia Chen, Wei-Hua Li, Cheng Sun, Yu-Chiang Frank Wang, Chu-Song Chen
PDF
SAMFusion: Sensor-Adaptive Multimodal Fusion for 3D Object Detection in Adverse Weather Edoardo Palladin, Roland Dietze, Praveen Narayanan, Mario Bijelic, Felix Heide
PDF
Sapiens: Foundation for Human Vision Models Rawal Khirodkar, Timur Bagautdinov, Julieta Martinez, Zhaoen Su, Austin T James, Peter Selednik, Stuart Anderson, Shunsuke Saito
PDF
SAVE: Protagonist Diversification with Structure Agnostic Video Editing Yeji Song, Wonsik Shin, Junsoo Lee, Jeesoo Kim, Nojun Kwak
PDF
SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer Zijie Wu, Chaohui Yu, Yanqin Jiang, Chenjie Cao, Fan Wang, Xiang Bai
PDF
Scalable Group Choreography via Variational Phase Manifold Learning Nhat Le, Khoa Do, Xuan Bui, Tuong Do, Erman Tjiputra, Quang D. Tran, Anh Nguyen
PDF
Scalar Function Topology Divergence: Comparing Topology of 3D Objects Ilya Trofimov, Daria Voronkova, Eduard Tulchinskii, Evgeny Burnaev, Serguei Barannikov
PDF
ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation Zhiyuan Ma, Yuxiang Wei, Yabin Zhang, Xiangyu Zhu, Zhen Lei, Lei Zhang
PDF
Scaling Backwards: Minimal Synthetic Pre-Training? Ryo Nakamura, Ryu Tadokoro, Ryosuke Yamada, Yuki M Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka
PDF
Scaling up Personalized Image Aesthetic Assessment via Task Vector Customization Jooyeol Yun, Jaegul Choo
PDF
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities Chenming Zhu, Tai Wang, Wenwei Zhang, Kai Chen, Xihui Liu
PDF
ScanTalk: 3D Talking Heads from Unregistered Scans Federico Nocentini, Thomas Besnier, Claudio Ferrari, Sylvain Arguillere, Stefano Berretti, Mohamed Daoudi
PDF
SCAPE: A Simple and Strong Category-Agnostic Pose Estimator Yujia Liang, Zixuan Ye, Wenze Liu, Hao Lu
PDF
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention Chenhang He, Ruihuang Li, Guowen Zhang, Lei Zhang
PDF
Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer Eric Brachmann, Jamie Wynn, Shuai Chen, Tommaso Cavallari, Aron Monszpart, Daniyar Turmukhambetov, Victor Adrian Prisacariu
PDF
Scene-Aware Human Motion Forecasting via Mutual Distance Prediction Chaoyue Xing, Wei Mao, Miaomiao Liu
PDF
Scene-Conditional 3D Object Stylization and Composition Jinghao Zhou, Tomas Jakab, Philip Torr, Christian Rupprecht
PDF
Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection Tim Salzmann, Markus Ryll, Alex Bewley, Matthias Minderer
PDF
SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs Yang Miao, Francis Engelmann, Olga Vysotska, Federico Tombari, Marc Pollefeys, Daniel Barath
PDF
SceneScript: Reconstructing Scenes with an Autoregressive Structured Language Model Armen Avetisyan, Christopher Xie, Henry Howard-Jenkins, Tsun-Yi Yang, Samir Aroudj, Suvam Patra, Fuyang Zhang, Luke Holland, Duncan Frost, Campbell Orme, Jakob Engel, Edward Miller, Richard Newcombe, Vasileios Balntas
PDF
SceneTeller: Language-to-3D Scene Generation Basak Melis Ocal, Maxim Tatarchenko, Sezer Karaoglu, Theo Gevers
PDF
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding Baoxiong Jia, Yixin Chen, Huangyue Yu, Yan Wang, Xuesong Niu, Tengyu Liu, Qing Li, Siyuan Huang
PDF
Scissorhands: Scrub Data Influence via Connection Sensitivity in Networks Jing Wu, Mehrtash Harandi
PDF
SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference Feng Wang, Jieru Mei, Alan Yuille
PDF
SCOD: From Heuristics to Theory Vojtech Franc, Jakub Paplham, Daniel Prusa
PDF
SCOMatch: Alleviating Overtrusting in Open-Set Semi-Supervised Learning Zerun Wang, Liuyu Xiang, Lang Huang, Jiafeng Mao, Ling Xiao, Toshihiko Yamasaki
PDF
Score Distillation Sampling with Learned Manifold Corrective Thiemo Alldieck, Nikos Kolotouros, Cristian Sminchisescu
PDF
SCP-Diff: Spatial-Categorical Joint Prior for Diffusion Based Semantic Image Synthesis Huan-ang Gao, Mingju Gao, Jiaju Li, Wenyi Li, Rong Zhi, Hao Tang, Hao Zhao
PDF
SCPNet: Unsupervised Cross-Modal Homography Estimation via Intra-Modal Self-Supervised Learning Runmin Zhang, Jun Ma, Lun Luo, Beinan Yu, Shu-Jie Chen, Junwei Li, Hui-Liang Shen, Si-Yuan Cao
PDF
ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image Hallee E. Wong, Marianne Rakic, John Guttag, Adrian V. Dalca
PDF
SDPT: Synchronous Dual Prompt Tuning for Fusion-Based Visual-Language Pre-Trained Models Yang Zhou, Yongjian Wu, Jiya Saiyin, Bingzheng Wei, Maode Lai, Eric I Chang, Yan Xu
PDF
SEA-RAFT: Simple, Efficient, Accurate RAFT for Optical Flow Yihan Wang, Lahav O Lipson, Jia Deng
PDF
SeA: Semantic Adversarial Augmentation for Last Layer Features from Unsupervised Representation Learning Qi Qian, Yuanhong Xu, Juhua Hu
PDF
SEDiff: Structure Extraction for Domain Adaptive Depth Estimation via Denoising Diffusion Models Dongseok Shim, Hyoun Jin Kim
PDF
See and Think: Embodied Agent in Virtual Environment Zhonghan Zhao, Xuan Wang, Wenhao Chai, Boyi Li, Shengyu Hao, Shidong Cao, Tian Ye, Gaoang Wang
PDF
SEED: A Simple and Effective 3D DETR in Point Clouds Zhe Liu, Jinghua Hou, Xiaoqing Ye, Tong Wang, Jingdong Wang, Xiang Bai
PDF
Seeing Faces in Things: A Model and Dataset for Pareidolia Mark T Hamilton, Simon Stent, Vasha G DuTell, Anne Harrington, Jennifer E Corbett, Ruth Rosenholtz, William T. Freeman
PDF
Seeing the Unseen: A Frequency Prompt Guided Transformer for Image Restoration Shihao Zhou, Jinshan Pan, Jinglei Shi, Duosheng Chen, Lishen Qu, Jufeng Yang
PDF
SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving Qingwen Zhang, Yi Yang, Peizheng Li, Olov Andersson, Patric Jensfelt
PDF
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis Hanrong Ye, Jason Kuen, Qing Liu, Zhe Lin, Brian Price, Dan Xu
PDF
SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation Lingchen Meng, Shiyi Lan, Hengduo Li, Jose M Alvarez, Zuxuan Wu, Yu-Gang Jiang
PDF
Segment and Recognize Anything at Any Granularity Feng Li, Hao Zhang, Peize Sun, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianwei Yang, Lei Zhang, Jianfeng Gao
PDF
Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts Jianhao Li, Tianyu Sun, Zhongdao Wang, Enze Xie, Bailan Feng, Hongbo Zhang, Ze Yuan, Ke Xu, Jiaheng Liu, Ping Luo
PDF
Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation Without Manual Labels Rui Huang, Songyou Peng, Ayca Takmaz, Federico Tombari, Marc Pollefeys, Shiji Song, Gao Huang, Francis Engelmann
PDF
Segmentation-Guided Layer-Wise Image Vectorization with Gradient Fills Hengyu Zhou, Hui Zhang, Bin Wang
PDF
SegPoint: Segment Any Point Cloud via Large Language Model Shuting He, Henghui Ding, Xudong Jiang, Bihan Wen
PDF
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding Weitai Kang, Gaowen Liu, Mubarak Shah, Yan Yan
PDF
SeiT++: Masked Token Modeling Improves Storage-Efficient Training Minhyun Lee, Song Park, Byeongho Heo, Dongyoon Han, Hyunjung Shim
PDF
Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models Yu-Chu Yu, Chi-Pin Huang, Jr-Jen Chen, Kai-Po Chang, Yung-Hsuan Lai, Fu-En Yang, Yu-Chiang Frank Wang
PDF
SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery Sarah Rastegar, Mohammadreza Salehi, Yuki M Asano, Hazel Doughty, Cees Snoek
PDF
Self-Adapting Large Visual-Language Models to Edge Devices Across Visual Modalities Kaiwen Cai, ZheKai Duan, Gaowen Liu, Charles Fleming, Chris Xiaoxuan Lu
PDF
Self-Cooperation Knowledge Distillation for Novel Class Discovery Yuzheng Wang, Zhaoyu Chen, Dingkang Yang, Yunquan Sun, Lizhe Qi
PDF
Self-Guided Generation of Minority Samples Using Diffusion Models Soobin Um, Jong Chul Ye
PDF
Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance Donghoon Ahn, Hyoungwon Cho, Jaewon Min, Jungwoo Kim, Wooseok Jang, SeonHwa Kim, Hyun Hee Park, Kyong Hwan Jin, Seungryong Kim
PDF
Self-Supervised Any-Point Tracking by Contrastive Random Walks Ayush Shrivastava, Andrew Owens
PDF
Self-Supervised Audio-Visual Soundscape Stylization Tingle Li, Renhao Wang, Po-Yao Huang, Andrew Owens, Gopala Krishna Anumanchipalli
PDF
Self-Supervised Co-Salient Object Detection via Feature Correspondences at Multiple Scales Souradeep Chakraborty, Dimitris Samaras
PDF
Self-Supervised Feature Adaptation for 3D Industrial Anomaly Detection Yuanpeng Tu, Boshen Zhang, Liang Liu, Yuxi Li, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Cairong Zhao
PDF
Self-Supervised Representation Learning for Adversarial Attack Detection Yi Li, Plamen Angelov, Neeraj Suri
PDF
Self-Supervised Shape Completion via Involution and Implicit Correspondences Mengya Liu, Ajad Chhatkuli, Janis Postels, Luc Van Gool, Federico Tombari
PDF
Self-Supervised Underwater Caustics Removal and Descattering via Deep Monocular SLAM Jonathan Sauder, Devis Tuia
PDF
Self-Supervised Video Copy Localization with Regional Token Representation Minlong Lu, Yichen Lu, Siwei Nie, Xudong Yang, Xiaobo Zhang
PDF
Self-Supervised Video Desmoking for Laparoscopic Surgery Renlong Wu, Zhilu Zhang, Shuohao Zhang, Longfei Gou, Haobin Chen, Lei Zhang, Hao Chen, Wangmeng Zuo
PDF
Self-Supervised Visual Learning from Interactions with Objects Arthur Aubret, Céline Teulière, Jochen Triesch
PDF
Self-Training Room Layout via Geometry-Aware Ray-Casting Bolivar Solarte, Chin-Hsuan Wu, Jin-Cheng Jhang, Jonathan Lee, Yi-Hsuan Tsai, Min Sun
PDF
SelfGeo: Self-Supervised and Geodesic-Consistent Estimation of Keypoints on Deformable Shapes Mohammad Zohaib, Luca Cosmo, Alessio Del Bue
PDF
SelfSwapper: Self-Supervised Face Swapping via Shape Agnostic Masked AutoEncoder Jaeseong Lee, Junha Hyung, Sohyun Jeong, Jaegul Choo
PDF
Semantic Diversity-Aware Prototype-Based Learning for Unbiased Scene Graph Generation Jaehyeong Jeon, Kibum Kim, Kanghoon Yoon, Chanyoung Park
PDF
Semantic Residual Prompts for Continual Learning Martin Menabue, Emanuele Frascaroli, Matteo Boschini, Enver Sangineto, Lorenzo Bonicelli, Angelo Porrello, Simone Calderara
PDF
Semantic-Guided Robustness Tuning for Few-Shot Transfer Across Extreme Domain Shift Kangyu Xiao, Zilei Wang, Junjie Li
PDF
Semantically Guided Representation Learning for Action Anticipation Anxhelo Diko, Danilo Avola, Bardh Prenkaj, Federico Fontana, Luigi Cinque
PDF
SemanticHuman-HD: High Resolution Semantic Disentangled 3D Human Generation Peng Zheng, Tao Liu, Zili Yi, Rui Ma
PDF
SemGrasp: Semantic Grasp Generation via Language Aligned Discretization Kailin Li, Jingbo Wang, Lixin Yang, Cewu Lu, Bo Dai
PDF
Semi-Supervised Segmentation of Histopathology Images with Noise-Aware Topological Consistency Meilong Xu, Xiaoling Hu, Saumya Gupta, Shahira Abousamra, Chao Chen
PDF
Semi-Supervised Teacher-Reference-Student Architecture for Action Quality Assessment Wulian Yun, Mengshi Qi, Fei Peng, Huadong Ma
PDF
Semi-Supervised Video Desnowing Network via Temporal Decoupling Experts and Distribution-Driven Contrastive Regularization Hongtao Wu, Angelica I Aviles-Rivero, Yijun Yang, Jingjing Ren, Sixiang Chen, Haoyu Chen, Lei Zhu
PDF
Semicalibrated Relative Pose from an Affine Correspondence and Monodepth Petr Hruby, Marc Pollefeys, Daniel Barath
PDF
SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance Lukas Hoyer, David Joseph Tan, Muhammad Ferjad Naeem, Luc Van Gool, Federico Tombari
PDF
SemReg: Semantics Constrained Point Cloud Registration Sheldon Fung, Xuequan Lu, Dasith de Silva Edirimuni, Wei Pan, Xiao Liu, Hongdong Li
PDF
SemTrack: A Large-Scale Dataset for Semantic Tracking in the Wild Pengfei Wang, Xiaofei Hui, Jing Wu, Zile Yang, Kian Eng Ong, Xinge Zhao, Beijia Lu, Dezhao Huang, Evan Ling, Weiling Chen, Keng Teck Ma, Minhoe Hur, Jun Liu
PDF
SENC: Handling Self-Collision in Neural Cloth Simulation Zhouyingcheng Liao, Sinan Wang, Taku Komura
PDF
Sequential Representation Learning via Static-Dynamic Conditional Disentanglement Mathieu Cyrille Simon, Pascal Frossard, Christophe De Vleeschouwer
PDF
SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds Yanbo Wang, Wentao Zhao, Cao Chuan, Tianchen Deng, Jingchuan Wang, Weidong Chen
PDF
SG-NeRF: Neural Surface Reconstruction with Scene Graph Optimization Yiyang Chen, Siyan Dong, Xulong Wang, Lulu Cai, Youyi Zheng, Yanchao Yang
PDF
SGS-SLAM: Semantic Gaussian Splatting for Neural Dense SLAM Mingrui Li, Shuhong Liu, Heng Zhou, Guohao Zhu, Na Cheng, Tianchen Deng, Hongyu Wang
PDF
Shape from Heat Conduction Sriram Narayanan, Mani Ramanagopal, Mark Sheinin, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan
PDF
Shape-Guided Configuration-Aware Learning for Endoscopic-Image-Based Pose Estimation of Flexible Robotic Instruments Yiyao Ma, Kai Chen, Hon-Sing Tong, Ruofeng Wei, Yui-Lun Ng, Ka-Wai Kwok, Qi Dou
PDF
Shape2Scene: 3D Scene Representation Learning Through Pre-Training on Shape Data Tuo Feng, Wenguan Wang, Ruijie Quan, Yi Yang
PDF
ShapeFusion: 3D Localized Human Diffusion Models Rolandos Alexandros Potamias, Michael Tarasiou, Stylianos Ploumpis, Stefanos Zafeiriou
PDF
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction Zekun Qi, Runpei Dong, Shaochen Zhang, Haoran Geng, Chunrui Han, Zheng Ge, Li Yi, Kaisheng Ma
PDF
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions Lin Chen, Jinsong Li, Xiaoyi Dong, Pan Zhang, Conghui He, Jiaqi Wang, Feng Zhao, Dahua Lin
PDF
Shedding More Light on Robust Classifiers Under the Lens of Energy-Based Models Mujtaba Hussain Mirza, Maria Rosaria Briglia, Senad Beadini, Iacopo Masi
PDF
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning Haiwen Diao, Bo Wan, Xu Jia, Yunzhi Zhuge, Ying Zhang, Huchuan Lu, Long Chen
PDF
SHIC: Shape-Image Correspondences with No Keypoint Supervision Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
PDF
Shifted Autoencoders for Point Annotation Restoration in Object Counting Yuda Zou, Xin Xiao, Peilin Zhou, Zhichao Sun, Bo Du, Yongchao Xu
PDF
SHINE: Saliency-Aware HIerarchical NEgative Ranking for Compositional Temporal Grounding Zixu Cheng, Yujiang Pu, Shaogang Gong, Parisa Kordjamshidi, Yu Kong
PDF
ShoeModel: Learning to Wear on the User-Specified Shoes via Diffusion Model Wenyu Li, Binghui Chen, Yifeng Geng, Xuansong Xie, Wangmeng Zuo
PDF
Siamese Vision Transformers Are Scalable Audio-Visual Learners Yan-Bo Lin, Gedas Bertasius
PDF
SIGMA: Sinkhorn-Guided Masked Video Modeling Mohammadreza Salehi, Michael Dorkenwald, Fida Mohammad Thoker, Efstratios Gavves, Cees Snoek, Yuki M Asano
PDF
SignAvatars: A Large-Scale 3D Sign Language Holistic Motion Dataset and Benchmark Zhengdi Yu, Shaoli Huang, Yongkang Cheng, Tolga Birdal
PDF
SignGen: End-to-End Sign Language Video Generation with Latent Diffusion Fan Qi, Yu Duan, Changsheng Xu, Huaiwen Zhang
PDF
SILC: Improving Vision Language Pretraining with Self-Distillation Muhammad Ferjad Naeem, Yongqin Xian, Xiaohua Zhai, Lukas Hoyer, Luc Van Gool, Federico Tombari
PDF
SIMBA: Split Inference - Mechanisms, Benchmarks and Attacks Abhishek Singh, Vivek Sharma, Rohan Sukumaran, John J Mose, Jeffrey K Chiu, Justin Yu, Ramesh Raskar
PDF
Similarity of Neural Architectures Using Adversarial Attack Transferability Jaehui Hwang, Dongyoon Han, Byeongho Heo, Song Park, Sanghyuk Chun, Jong-Seok Lee
PDF
SimPB: A Single Model for 2D and 3D Object Detection from Multiple Cameras Yingqi Tang, Zhaotie Meng, Guoliang Chen, Erkang Cheng
PDF
Simple Unsupervised Knowledge Distillation with Space Similarity Aditya Singh, Haohan Wang
PDF
Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights Yan Hao, Florent Forest, Olga Fink
PDF
SINDER: Repairing the Singular Defects of DINOv2 Haoqi Wang, Tong Zhang, Mathieu Salzmann
PDF
Single-Mask Inpainting for Voxel-Based Neural Radiance Fields Jiafu Chen, Tianyi Chu, Jiakai Sun, Wei Xing, Lei Zhao
PDF
Single-Photon 3D Imaging with Equi-Depth Photon Histograms Kaustubh Sadekar, David Maier, Atul Ingle
PDF
SiT: Exploring Flow and Diffusion-Based Generative Models with Scalable Interpolant Transformers Nanye Ma, Mark Goldstein, Michael Albergo, Nicholas M Boffi, Eric Vanden-Eijnden, Saining Xie
PDF
Situated Instruction Following So Yeon Min, Xavier Puig, Devendra Singh Chaplot, Tsung-Yen Yang, Priyam Parashar, Akshara Rai, Ruslan Salakhutdinov, Yonatan Bisk, Roozbeh Mottaghi
PDF
Six-Point Method for Multi-Camera Systems with Reduced Solution Space Banglei Guan, Ji Zhao, Laurent Kneip
PDF
SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition Jeonghyeok Do, Munchurl Kim
PDF
Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures Yannick Kirchhoff, Maximilian R Rokuss, Saikat Roy, Balint Kovacs, Constantin Ulrich, Tassilo Wald, Maximilian Zenk, Philipp Vollmuth, Jens Kleesiek, Fabian Isensee, Klaus H. Maier-Hein
PDF
Skeleton-Based Group Activity Recognition via Spatial-Temporal Panoramic Graph Zhengcen Li, Xinle Chang, Yueran Li, Jingyong Su
PDF
Sketch2Vox: Learning 3D Reconstruction from a Single Monocular Sketch Image Fei Wang
PDF
Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation Yingshan Chang, Yasi Zhang, Zhiyuan Fang, Ying Nian Wu, Yonatan Bisk, Feng Gao
PDF
SkyMask: Attack-Agnostic Robust Federated Learning with Fine-Grained Learnable Masks Peishen Yan, Hao Wang, Tao Song, Yang Hua, Ruhui Ma, Ningxin Hu, Mohammad Reza Haghighat, Haibing Guan
PDF
SkyScenes: A Synthetic Dataset for Aerial Scene Understanding Sahil S Khose, Anisha Pal, Aayushi Agarwal, Deepanshi, Judy Hoffman, Prithvijit Chattopadhyay
PDF
SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking Siyuan Li, Lei Ke, Yung-Hsu Yang, Luigi Piccinelli, Mattia Segù, Martin Danelljan, Luc Van Gool
PDF
SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic Kashyap Chitta, Daniel Dauner, Andreas Geiger
PDF
SLIM: Spuriousness Mitigation with Minimal Human Annotations Xiwei Xuan, Ziquan Deng, Hsuan-Tien Lin, Kwan-Liu Ma
PDF
SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow Yuanzhi Zhu, Xingchao Liu, Qiang Liu
PDF
SlotLifter: Slot-Guided Feature Lifting for Learning Object-Centric Radiance Fields Yu Liu, Baoxiong Jia, Yixin Chen, Siyuan Huang
PDF
SmartControl: Enhancing ControlNet for Handling Rough Visual Conditions Xiaoyu Liu, Yuxiang Wei, Ming Liu, Xianhui Lin, Peiran Ren, Xuansong Xie, Wangmeng Zuo
PDF
SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution Mingjun Zheng, Long Sun, Jiangxin Dong, Jinshan Pan
PDF
SMILe: Leveraging Submodular Mutual Information for Robust Few-Shot Object Detection Anay Majee, Ryan X Sharp, Rishabh Iyer
PDF
SMooDi: Stylized Motion Diffusion Model Lei Zhong, Yiming Xie, Varun Jampani, Deqing Sun, Huaizu Jiang
PDF
Smoothness, Synthesis, and Sampling: Re-Thinking Unsupervised Multi-View Stereo with DIV Loss Alex Rich, Noah Stier, Pradeep Sen, Tobias Hollerer
PDF
SNeRV: Spectra-Preserving Neural Representation for Video Jina Kim, Jihoo Lee, Jewon Kang
PDF
SNP: Structured Neuron-Level Pruning to Preserve Attention Scores KyungHwan Shim, Jaewoong Yun, Shinkook Choi
PDF
Snuffy: Efficient Whole Slide Image Classifier Hossein Jafarinia, Alireza Alipanah, Saeed Razavi, Nahal Mirzaie, Mohammad Hossein Rohban
PDF
Soft Prompt Generation for Domain Generalization Shuanghao Bai, Yuedi Zhang, Wanqi Zhou, Zhirong Luan, Badong Chen
PDF
Soft Shadow Diffusion (SSD): Physics-Inspired Learning for 3D Computational Periscopy Fadlullah A Raji, John Murray-Bruce
PDF
Solving Motion Planning Tasks with a Scalable Generative Model Yihan Hu, Siqi Chai, Zhening Yang, Jingyu Qian, Kun Li, Wenxin Shao, Haichao Zhang, Wei Xu, Qiang Liu
PDF
Solving the Inverse Problem of Microscopy Deconvolution with a Residual Beylkin-Coifman-Rokhlin Neural Network Rui Li, Mikhail Kudryashev, Artur Yakimovich
PDF
SOS: Segment Object System for Open-World Instance Segmentation with Object Priors Christian Wilms, Tim Rolff, Maris N Hillemann, Robert Johanson, Simone Frintrop
PDF
Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models Ruibin Li, Ruihuang Li, Song Guo, Lei Zhang
PDF
Source-Free Domain-Invariant Performance Prediction Ekaterina Khramtsova, Mahsa Baktashmotlagh, Guido Zuccon, Xi Wang, Mathieu Salzmann
PDF
SpaceJAM: A Lightweight and Regularization-Free Method for Fast Joint Alignment of Images Nir Barel, Ron A Shapira Weber, Nir Mualem, Shahaf E Finder, Oren Freifeld
PDF
SPAMming Labels: Efficient Annotations for the Trackers of Tomorrow Orcun Cetintas, Tim Meinhardt, Guillem Brasó, Laura Leal-Taixé
PDF
SPARO: Selective Attention for Robust and Compositional Transformer Encodings for Vision Ankit Vani, Bac Nguyen, Samuel Lavoie, Ranjay Krishna, Aaron Courville
PDF
SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views Chao Xu, Ang Li, Linghao Chen, Yulin Liu, Ruoxi Shi, Hao Su, Minghua Liu
PDF
Sparse Beats Dense: Rethinking Supervision in Radar-Camera Depth Completion Huadong Li, Minhao Jing, Jin Wang, Shichao Dong, Jiajun Liang, Haoqiang Fan, Renhe Ji
PDF
Sparse Refinement for Efficient High-Resolution Semantic Segmentation Zhijian Liu, Zhuoyang Zhang, Samir Khaki, Shang Yang, Haotian Tang, Chenfeng Xu, Kurt Keutzer, Song Han
PDF
SparseCraft: Few-Shot Neural Reconstruction Through Stereopsis Guided Geometric Linearization Mae Younes, Amine Ouasfi, Adnane Boukhayma
PDF
SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models Yuwei Guo, Ceyuan Yang, Anyi Rao, Maneesh Agrawala, Dahua Lin, Bo Dai
PDF
SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection Hongcheng Zhang, Liu Liang, Pengxin Zeng, Xiao Song, Zhe Wang
PDF
SparseRadNet: Sparse Perception Neural Network on Subsampled Radar Data Jialong Wu, Mirko Meuter, Markus Schoeler, Matthias Rottmann
PDF
SparseSSP: 3D Subcellular Structure Prediction from Sparse-View Transmitted Light Images Jintu Zheng, Yi Ding, Qizhe Liu, Yuehui Chen, Yi Cao, Ying Hu, Zenan Wang
PDF
Spatial-Temporal Multi-Level Association for Video Object Segmentation Deshui Miao, Xin Li, Zhenyu He, Huchuan Lu, Ming-Hsuan Yang
PDF
SpatialFormer: Towards Generalizable Vision Transformers with Explicit Spatial Understanding Han Xiao, Wenzhao Zheng, Sicheng Zuo, Peng Gao, Jie Zhou, Jiwen Lu
PDF
Spatially-Variant Degradation Model for Dataset-Free Super-Resolution Shaojie Guo, Haofei Song, Qingli Li, Yan Wang
PDF
Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition Sumin Lee, Yooseung Wang, Sangmin Woo, Changick Kim
PDF
SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization Xixu Hu, Runkai Zheng, Jindong Wang, Cheuk Hang Leung, Qi Wu, Xing Xie
PDF
Spectral Subsurface Scattering for Material Classification Haejoon Lee, Aswin Sankaranarayanan
PDF
SpeedUpNet: A Plug-and-Play Adapter Network for Accelerating Text-to-Image Diffusion Models Weilong Chai, Dandan Zheng, Jiajiong Cao, Zhiquan Chen, Changbao Wang, Chenguang Ma
PDF
SphereHead: Stable 3D Full-Head Synthesis with Spherical Tri-Plane Representation Heyuan Li, Ce Chen, Tianhao Shi, Yuda Qiu, Sizhe An, Guanying Chen, Xiaoguang Han
PDF
Spherical Linear Interpolation and Text-Anchoring for Zero-Shot Composed Image Retrieval Young Kyun Jang, Dat B Huynh, Ashish Shah, Wen-Kai Chen, Ser-Nam Lim
PDF
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos Heeseung Yun, Ruohan Gao, Ishwarya Ananthabhotla, Anurag Kumar, Jacob Donley, Chao Li, Gunhee Kim, Vamsi Krishna Ithapu, Calvin Murdock
PDF
SPHINX: A Mixer of Weights, Visual Embeddings and Image Scales for Multi-Modal Large Language Models Ziyi Lin, Dongyang Liu, Renrui Zhang, Peng Gao, Longtian Qiu, Han Xiao, Han Qiu, Wenqi Shao, Keqin Chen, Jiaming Han, Siyuan Huang, Yichi Zhang, Xuming He, Yu Qiao, Hongsheng Li
PDF
Spike-Temporal Latent Representation for Energy-Efficient Event-to-Video Reconstruction Jianxiong Tang, Jian-Huang Lai, Lingxiao Yang, Xiaohua Xie
PDF
Spiking Wavelet Transformer Yuetong Fang, Ziqing Wang, Lingfeng Zhang, Jiahang Cao, Honglei Chen, Renjing Xu
PDF
SPIN: Hierarchical Segmentation with Subpart Granularity in Natural Images Josh David Myers-Dean, Jarek T Reynolds, Brian Price, Yifei Fan, Danna Gurari
PDF
SPIRE: Semantic Prompt-Driven Image Restoration Chenyang Qi, Zhengzhong Tu, Keren Ye, Mauricio Delbracio, Peyman Milanfar, Qifeng Chen, Hossein Talebi
PDF
SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction Marko Mihajlovic, Sergey Prokudin, Siyu Tang, Robert Maier, Federica Bogo, Tony Tung, Edmond Boyer
PDF
Spline-Based Transformers Prashanth Chandran, Agon Serifi, Markus Gross, Moritz Bächer
PDF
SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments Niklas Gard, Anna Hilsmann, Peter Eisert
PDF
SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant Guohao Sun, Can Qin, Jiaminan Wang, Zeyuan Chen, Ran Xu, Zhiqiang Tao
PDF
SRPose: Two-View Relative Pose Estimation with Sparse Keypoints Rui Yin, Yulun Zhang, Zherong Pan, Jianjun Zhu, Cheng Wang, Biao Jia
PDF
SSL-Cleanse: Trojan Detection and Mitigation in Self-Supervised Learning Mengxin Zheng, Jiaqi Xue, Zihao Wang, Xun Chen, Qian Lou, Lei Jiang, Xiaofeng Wang
PDF
ST-LDM: A Universal Framework for Text-Grounded Object Generation in Real Images Xiangtian Xue, Jiasong Wu, Youyong Kong, Lotfi Senhadji, Huazhong Shu
PDF
ST-LLM: Large Language Models Are Effective Temporal Learners Ruyang Liu, Chen Li, Haoran Tang, Yixiao Ge, Ying Shan, Ge Li
PDF
Stable Preference: Redefining Training Paradigm of Human Preference Model for Text-to-Image Synthesis Hanting Li, Hongjing Niu, Feng Zhao
PDF
Stable Video Portraits Mirela Ostrek, Justus Thies
PDF
StableDrag: Stable Dragging for Point-Based Image Editing Yutao Cui, Xiaotong Zhao, Guozhen Zhang, Shengming Cao, Kai Ma, Limin Wang
PDF
STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians Yifei Zeng, Yanqin Jiang, Siyu Zhu, Yuanxun Lu, Youtian Lin, Hao Zhu, Weiming Hu, Xun Cao, Yao Yao
PDF
STAMP: Outlier-Aware Test-Time Adaptation with Stable Memory Replay Yu Yongcan, Lijun Sheng, Ran He, Jian Liang
PDF
Statewide Visual Geolocalization in the Wild Florian Fervers, Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen
PDF
Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation Juncheng Ma, Peiwen Sun, Yaoting Wang, Di Hu
PDF
Stepwise Multi-Grained Boundary Detector for Point-Supervised Temporal Action Localization Mengnan Liu, Le Wang, Sanping Zhou, Kun Xia, Qi Wu, Qilin Zhang, Gang Hua
PDF
StereoGlue: Joint Feature Matching and Robust Estimation Daniel Barath, Dmytro Mishkin, Luca Cavalli, Paul-Edouard Sarlin, Petr Hruby, Marc Pollefeys
PDF
Stitched ViTs Are Flexible Vision Backbones Zizheng Pan, Jing Liu, Haoyu He, Jianfei Cai, Bohan Zhuang
PDF
StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion Ming Tao, Bingkun Bao, Hao Tang, Yaowei Wang, Changsheng Xu
PDF
Straightforward Layer-Wise Pruning for More Efficient Visual Adaptation Ruizi Han, Jinglei Tang
PDF
Stream Query Denoising for Vectorized HD-Map Construction Shuo Wang, Fan Jia, Weixin Mao, Yingfei Liu, Yucheng Zhao, Zehui Chen, Tiancai Wang, Chi Zhang, Xiangyu Zhang, Feng Zhao
PDF
Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting Yunzhi Yan, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, Sida Peng
PDF
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang
PDF
Strike a Balance in Continual Panoptic Segmentation Jinpeng Chen, Runmin Cong, Yuxuan Luo, Horace Ho Shing Ip, Sam Kwong
PDF
Stripe Observation Guided Inference Cost-Free Attention Mechanism Zhongzhan Huang, Shanshan Zhong, Wushao Wen, Jinghui Qin, Liang Lin
PDF
StructLDM: Structured Latent Diffusion for 3D Human Generation Tao Hu, Fangzhou Hong, Ziwei Liu
PDF
Structured-NeRF: Hierarchical Scene Graph with Neural Representation Zhide Zhong, Jiakai Cao, Songen Gu, Sirui Xie, Liyi Luo, Hao Zhao, Guyue Zhou, Haoang Li, Zike Yan
PDF
STSP: Spatial-Temporal Subspace Projection for Video Class-Incremental Learning Hao Cheng, Siyuan Yang, Chong Wang, Joey Tianyi Zhou, Alex Kot, Bihan Wen
PDF
Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation Mathias Öttl, Frauke Wilm, Jana Steenpass, Jingna Qiu, Matthias Rübner, Prof Arndt Hartmann, Matthias W. Beckmann, Peter Fasching, Andreas K Maier, Ramona Erber, Bernhard Kainz, Katharina Breininger
PDF
StyleCity: Large-Scale 3D Urban Scenes Stylization Yingshu Chen, Huajian Huang, Tuan-Anh Vu, Ka Chun Shum, Sai-Kit Yeung
PDF
StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models Wen Li, Muyuan Fang, Cheng Zou, Biao Gong, Ruobing Zheng, Meng Wang, Jingdong Chen, Ming Yang
PDF
Subspace Prototype Guidance for Mitigating Class Imbalance in Point Cloud Semantic Segmentation Jiawei Han, Kaiqi Liu, Wei Li, Guangzhi Chen
PDF
SUMix: Mixup with Semantic and Uncertain Information Huafeng Qin, Xin Jin, Hongyu Zhu, Hongchao Liao, Mounim A. El Yacoubi, Xinbo Gao
PDF
SUP-NeRF: A Streamlined Unification of Pose Estimation and NeRF for Monocular 3D Object Reconstruction Yuliang Guo, Abhinav Kumar, Cheng Zhao, Ruoyu Wang, Xinyu Huang, Liu Ren
PDF
SuperFedNAS: Cost-Efficient Federated Neural Architecture Search for On-Device Inference Alind Khare, Animesh Agrawal, Aditya Annavajjala, Payman Behnam, Myungjin Lee, Hugo M Latapie, Alexey Tumanov
PDF
SuperGaussian: Repurposing Video Models for 3D Super Resolution Yuan Shen, Duygu Ceylan, Paul Guerrero, Zexiang Xu, Niloy J. Mitra, Shenlong Wang, Anna Fruehstueck
PDF
Superpixel-Informed Implicit Neural Representation for Multi-Dimensional Data Jia-Yi Li, Xi-Le Zhao, Jian-Li Wang, Chao Wang, Min Wang
PDF
Sur^2f: A Hybrid Representation for High-Quality and Efficient Surface Reconstruction from Multi-View Images Zhangjin Huang, Zhihao Liang, Kui Jia
PDF
Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models Zhengming Yu, Zhiyang Dou, Xiaoxiao Long, Cheng Lin, Zekun Li, Yuan Liu, Norman Müller, Taku Komura, Marc Habermann, Christian Theobalt, Xin Li, Wenping Wang
PDF
Surface Reconstruction for 3D Gaussian Splatting via Local Structural Hints Qianyi Wu, Jianmin Zheng, Jianfei Cai
PDF
Surface-Centric Modeling for High-Fidelity Generalizable Neural Surface Reconstruction Rui Peng, Shihe Shen, Kaiqiang Xiong, Huachen Gao, Jianbo Jiao, Xiaodong Gu, Ronggang Wang
PDF
SV3D: Novel Multi-View Synthesis and 3D Generation from a Single Image Using Latent Video Diffusion Vikram Voleti, Chun-Han Yao, Mark Boss, Adam Letts, David Pankratz, Dmitrii Tochilkin, Christian Laforte, Robin Rombach, Varun Jampani
PDF
SWAG: Splatting in the Wild Images with Appearance-Conditioned Gaussians Hiba Dahmani, Moussab Bennehar, Nathan Piasco, Luis G Roldao Jimenez, Dzmitry Tsishkou
PDF
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Image Editing Jing Gu, Nanxuan Zhao, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Yilin Wang, Xin Eric Wang
PDF
SweepNet: Unsupervised Learning Shape Abstraction via Neural Sweepers Mingrui Zhao, Yizhi Wang, Fenggen Yu, Changqing Zou, Ali Mahdavi-Amiri
PDF
SwiftBrush V2: Make Your One-Step Diffusion Model Better than Its Teacher Trung Tuan Dao, Thuan Hoang Nguyen, Thanh Van Le, Duc H Vu, Khoi Nguyen, Cuong Pham, Anh T Tran
PDF
SWinGS: Sliding Windows for Dynamic 3D Gaussian Splatting Richard Shaw, Michal Nazarczuk, Jifei Song, Arthur Moreau, Sibi Catley-Chandar, Helisa Dhamo, Eduardo Pérez Pellitero
PDF
Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts Byeongjun Park, Hyojun Go, Jin-Young Kim, Sangmin Woo, Seokil Ham, Changick Kim
PDF
Syn-to-Real Domain Adaptation for Point Cloud Completion via Part-Based Approach Yunseo Yang, Jihun Kim, Kuk-Jin Yoon
PDF
Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets Ishan Rajendrakumar Dave, Fabian Caba, Mubarak Shah, Simon Jenni
PDF
Synchronization Is All You Need: Exocentric-to-Egocentric Transfer for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs Camillo Quattrocchi, Antonino Furnari, Daniele Di Mauro, Mario Valerio Giuffrida, Giovanni Maria Farinella
PDF
Synchronization of Projective Transformations Rakshith Madhavan, Andrea Fusiello, Federica Arrigoni
PDF
Synchronous Diffusion for Unsupervised Smooth Non-Rigid 3D Shape Matching Dongliang Cao, Zorah Laehner, Florian Bernard
PDF
Synergy of Sight and Semantics: Visual Intention Understanding with CLIP Qu Yang, Mang Ye, Dacheng Tao
PDF
Synthesizing Environment-Specific People in Photographs Mirela Ostrek, Carol O'Sullivan, Michael J. Black, Justus Thies
PDF
Synthesizing Time-Varying BRDFs via Latent Space Takuto Narumoto, Hiroaki Santo, Fumio Okura
PDF
T-CorresNet: Template Guided 3D Point Cloud Completion with Correspondence Pooling Query Generation Strategy Fan Duan, Jiahao Yu, Li Chen
PDF
T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning Weijie Wei, Fatemeh Karimi Nejadasl, Theo Gevers, Martin R. Oswald
PDF
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy Qing Jiang, Feng Li, Zhaoyang Zeng, Shilong Liu, Tianhe Ren, Lei Zhang
PDF
T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models Zhongqi Wang, Jie Zhang, Shiguang Shan, Xilin Chen
PDF
Tackling Structural Hallucination in Image Translation with Local Diffusion Seunghoi Kim, Chen Jin, Tom Diethe, Matteo Figini, Henry FJ Tregidgo, Asher Mullokandov, Philip A Teare, Daniel Alexander
PDF
TAG: Text Prompt Augmentation for Zero-Shot Out-of-Distribution Detection Xixi Liu, Christopher Zach
PDF
Take a Step Back: Rethinking the Two Stages in Visual Reasoning Mingyu Zhang, Jiting Cai, Mingyu Liu, Yue Xu, Cewu Lu, Yong-Lu Li
PDF
TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting Jiahe Li, Jiawei Zhang, Xiao Bai, Jin Zheng, Xin Ning, Jun Zhou, Lin Gu
PDF
Taming CLIP for Fine-Grained and Structured Visual Understanding of Museum Exhibits Ada-Astrid Balauca, Danda Pani Paudel, Kristina Toutanova, Luc Van Gool
PDF
Taming Latent Diffusion Model for Neural Radiance Field Inpainting Chieh Hubert Lin, Changil Kim, Jia-Bin Huang, Qinbo Li, Chih-Yao Ma, Johannes Kopf, Ming-Hsuan Yang, Hung-Yu Tseng
PDF
Taming Lookup Tables for Efficient Image Retouching Sidi Yang, Binxiao Huang, Mingdeng Cao, Yatai Ji, Hanzhong Guo, Ngai Wong, Yujiu Yang
PDF
TAPTR: Tracking Any Point with Transformers as Detection Hongyang Li, Hao Zhang, Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Lei Zhang
PDF
Task-Driven Uncertainty Quantification in Inverse Problems via Conformal Prediction Jeffrey Wen, Rizwan Ahmad, Phillip Schniter
PDF
TC4D: Trajectory-Conditioned Text-to-4D Generation Sherwin Bahmani, Xian Liu, Wang Yifan, Ivan Skorokhodov, Victor Rong, Ziwei Liu, Xihui Liu, Jeong Joon Park, Sergey Tulyakov, Gordon Wetzstein, Andrea Tagliasacchi, David B Lindell
PDF
TCAN: Animating Human Images with Temporally Consistent Pose Guidance Using Diffusion Models Jeongho Kim, Min-Jung Kim, Junsoo Lee, Jaegul Choo
PDF
TCC-Det: Temporarily Consistent Cues for Weakly-Supervised 3D Detection Jan Skvrna, Lukáš Neumann
PDF
TCLC-GS: Tightly Coupled LiDAR-Camera Gaussian Splatting for Autonomous Driving Cheng Zhao, Su Sun, Ruoyu Wang, Yuliang Guo, Jun-Jun Wan, Zhou Huang, Xinyu Huang, Yingjie Victor Chen, Liu Ren
PDF
Teach CLIP to Develop a Number Sense for Ordinal Regression Yao Du, Qiang Zhai, Weihang Dai, Xiaomeng Li
PDF
Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint Sixiang Chen, Tian Ye, Kai Zhang, Zhaohu Xing, Yunlong Lin, Lei Zhu
PDF
Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching Ruonan Yu, Songhua Liu, Jingwen Ye, Xinchao Wang
PDF
Temporal as a Plugin: Unsupervised Video Denoising with Pre-Trained Image Denoisers Zixuan Fu, Lanqing Guo, Chong Wang, Yufei Wang, Zhihao Li, Bihan Wen
PDF
Temporal Event Stereo via Joint Learning with Stereoscopic Flow Hoonhee Cho, Jae-Young Kang, Kuk-Jin Yoon
PDF
Temporal Residual Guided Diffusion Framework for Event-Driven Video Reconstruction Lin Zhu, Yunlong Zheng, Yijun Zhang, Xiao Wang, Lizhi Wang, Hua Huang
PDF
Temporal Residual Jacobians for Rig-Free Motion Transfer Sanjeev Muralikrishnan, Niladri Shekhar Dutt, Siddhartha Chaudhuri, Noam Aigerman, Vladimir Kim, Matthew Fisher, Niloy Mitra
PDF
Temporal-Mapping Photography for Event Cameras Yuhan Bao, Lei Sun, Yuqin Ma, Kaiwei Wang
PDF
Temporally Consistent Stereo Matching Jiaxi Zeng, Chengtang Yao, Yuwei Wu, Yunde Jia
PDF
Tendency-Driven Mutual Exclusivity for Weakly Supervised Incremental Semantic Segmentation Chongjie Si, Xuehui Wang, Xiaokang Yang, Wei Shen
PDF
Tensorial Template Matching for Fast Cross-Correlation with Rotations and Its Application for Tomography Antonio Martinez-Sanchez, Ulrike Homberg, J. M. Almira, Harold Phelippeau
PDF
Test-Time Model Adaptation for Image Reconstruction Using Self-Supervised Adaptive Layers Yutian Zhao, Tianjing Zhang, Hui Ji
PDF
Test-Time Stain Adaptation with Diffusion Models for Histopathology Image Classification Cheng-Chang Tsai, Yuan-Chih Chen, Chun-Shien Lu
PDF
TetraDiffusion: Tetrahedral Diffusion Models for 3D Shape Generation Nikolai Kalischek, Torben Peters, Jan Dirk Wegner, Konrad Schindler
PDF
TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation Yufei Liu, Junwei Zhu, Junshu Tang, Shijie Zhang, Jiangning Zhang, Weijian Cao, Chengjie Wang, Yunsheng Wu, Dongjin Huang
PDF
TexGen: Text-Guided 3D Texture Generation with Multi-View Sampling and Resampling Dong Huo, Zixin Guo, Xinxin Zuo, Zhihao Shi, Juwei Lu, Peng Dai, Songcen Xu, Li Cheng, Yee-Hong Yang
PDF
Text Motion Translator: A Bi-Directional Model for Enhanced 3D Human Motion Generation from Open-Vocabulary Descriptions Yijun Qian, Jack Urbanek, Alexander Hauptmann, Jungdam Won
PDF
Text to Layer-Wise 3D Clothed Human Generation Junting Dong, Qi Fang, Zehuan Huang, Xudong Xu, Jingbo Wang, Sida Peng, Bo Dai
PDF
Text-Anchored Score Composition: Tackling Condition Misalignment in Text-to-Image Diffusion Models Luozhou Wang, Guibao Shen, Wenhang Ge, Guangyong Chen, Yijun Li, Yingcong Chen
PDF
Text-Conditioned Resampler for Long Form Video Understanding Bruno Korbar, Yongqin Xian, Alessio Tonioni, Andrew Zisserman, Federico Tombari
PDF
Text-Guided Video Masked Autoencoder David Fan, Jue Wang, Shuai Liao, Zhikang Zhang, Vimal Bhat, Xinyu Li
PDF
Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression Animesh Sinha, Bo Sun, Anmol Kalia, Arantxa Casanova, Elliot Blanchard, David Yan, Winnie Zhang, Tony Nelli, Jiahui Chen, Hardik Shah, Licheng Yu, Mitesh Kumar Singh, Ankit Ramchandani, Maziar Sanjabi, Sonal Gupta, Amy L Bearman, Dhruv Mahajan
PDF
Text2LiDAR: Text-Guided LiDAR Point Clouds Generation via Equirectangular Transformer Yang Wu, Kaihua Zhang, Jianjun Qian, Jin Xie, Jian Yang
PDF
Text2Place: Affordance-Aware Text Guided Human Placement Rishubh Parihar, Harsh Gupta, Sachidanand Vs, Venkatesh Babu Radhakrishnan
PDF
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering Jingye Chen, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei
PDF
Textual Grounding for Open-Vocabulary Visual Information Extraction in Layout-Diversified Documents Mengjun Cheng, Chengquan Zhang, Chang Liu, Yuke Li, Bohan Li, Kun Yao, Xiawu Zheng, Rongrong Ji, Jie Chen
PDF
Textual Knowledge Matters: Cross-Modality Co-Teaching for Generalized Visual Class Discovery Haiyang Zheng, Nan Pu, Wenjing Li, Nicu Sebe, Zhun Zhong
PDF
Textual Query-Driven Mask Transformer for Domain Generalized Segmentation Byeonghyun Pak, Byeongju Woo, Sunghwan Kim, Dae-hwan Kim, Hoseong Kim
PDF
Textual-Visual Logic Challenge: Understanding and Reasoning in Text-to-Image Generation Peixi Xiong, Michael A Kozuch, Nilesh Jain
PDF
Texture-GS: Disentangle the Geometry and Texture for 3D Gaussian Splatting Editing Tianxing Xu, Wenbo Hu, Yu-Kun Lai, Ying Shan, Song-Hai Zhang
PDF
TF-FAS: Twofold-Element Fine-Grained Semantic Guidance for Generalizable Face Anti-Spoofing Xudong Wang, Ke-Yue Zhang, Taiping Yao, Qianyu Zhou, Shouhong Ding, Pingyang Dai, Rongrong Ji
PDF
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World Weiyun Wang, Yiming Ren, Haowen Luo, Tiantong Li, Chenxiang Yan, Zhe Chen, Wenhai Wang, Qingyun Li, Lewei Lu, Xizhou Zhu, Yu Qiao, Jifeng Dai
PDF
The Devil Is in the Statistics: Mitigating and Exploiting Statistics Difference for Generalizable Semi-Supervised Medical Image Segmentation Muyang Qiu, Jian Zhang, Lei Qi, Qian Yu, Yinghuan Shi, Yang Gao
PDF
The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation Yi Yao, Chan-Feng Hsu, Jhe-Hao Lin, Hongxia Xie, Terence Lin, Yi-Ning Huang, Hong-Han Shuai, Wen-Huang Cheng
PDF
The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? Qinyu Zhao, Ming Xu, Kartik Gupta, Akshay Asthana, Liang Zheng, Stephen Gould
PDF
The Gaussian Discriminant Variational Autoencoder (GdVAE): A Self-Explainable Model with Counterfactual Explanations Anselm Haselhoff, Kevin Trelenberg, Fabian Küppers, Jonas Schneider
PDF
The Hard Positive Truth About Vision-Language Compositionality Amita Kamath, Cheng-Yu Hsieh, Kai-Wei Chang, Ranjay Krishna
PDF
The Lottery Ticket Hypothesis in Denoising: Towards Semantic-Driven Initialization Jiafeng Mao, Xueting Wang, Kiyoharu Aizawa
PDF
The Nerfect Match: Exploring NeRF Features for Visual Localization Qunjie Zhou, Maxim Maximov, Or Litany, Laura Leal-Taixé
PDF
The Role of Masking for Efficient Supervised Knowledge Distillation of Vision Transformers Seungwoo Son, Jegwang Ryu, Namhoon Lee, Jaeho Lee
PDF
The Sky's the Limit: Relightable Outdoor Scenes via a Sky-Pixel Constrained Illumination Prior and Outside-in Visibility James A D Gardner, Evgenii Kashin, Bernhard Egger, William Smith
PDF
Thermal3D-GS: Physics-Induced 3D Gaussians for Thermal Infrared Novel-View Synthesis Qian Chen, Shihao Shu, Xiangzhi Bai
PDF
Think Before Placement: Common Sense Enhanced Transformer for Object Placement Yaxuan Qin, Jiayu Xu, Ruiping Wang, Xilin Chen
PDF
Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-V2) Qifeng Li, Xiaosong Jia, Shaobo Wang, Junchi Yan
PDF
Thinking Outside the BBox: Unconstrained Generative Object Compositing Gemma Canet Tarrés, Zhe Lin, Zhifei Zhang, Jianming Zhang, Yizhi Song, Dan Ruta, Andrew Gilbert, John Collomosse, Soo Ye Kim
PDF
This Probably Looks Exactly like That: An Invertible Prototypical Network Zachariah Carmichael, Timothy P Redgrave, Daniel Gonzalez Cedre, Walter Scheirer
PDF
Three Things We Need to Know About Transferring Stable Diffusion to Visual Dense Prediction Tasks Manyuan Zhang, Guanglu Song, Xiaoyu Shi, Yu Liu, Hongsheng Li
PDF
TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models Aditya Chinchure, Pushkar Shukla, Gaurav Bhatt, Kiri Salij, Kartik Hosanagar, Leonid Sigal, Matthew Turk
PDF
Tight and Efficient Upper Bound on Spectral Norm of Convolutional Layers Ekaterina Grishina, Mikhail Gorbunov, Maxim Rakhuba
PDF
Time-Efficient and Identity-Consistent Virtual Try-on Using a Variant of Altered Diffusion Models Phuong Hoang Dam, Jihoon Jeong, Anh T Tran, Daeyoung Kim
PDF
TimeCraft: Navigate Weakly-Supervised Temporal Grounded Video Question Answering via Bi-Directional Reasoning Huabin Liu, Xiao Ma, Cheng Zhong, Yang Zhang, Weiyao Lin
PDF
TimeLens-XL: Real-Time Event-Based Video Frame Interpolation with Large Motion Shi Guo, Yutian Chen, Tianfan Xue, Jinwei Gu, Yongrui Ma
PDF
Timestep-Aware Correction for Quantized Diffusion Models Yuzhe Yao, Feng Tian, Jun Chen, Haonan Lin, Guang Dai, Yong Liu, Jingdong Wang
PDF
Tiny Models Are the Computational Saver for Large Models Qingyuan Wang, Barry Cardiff, Antoine Frappé, Benoit Larras, Deepu John
PDF
TIP: Tabular-Image Pre-Training for Multimodal Classification with Incomplete Data Siyi Du, Shaoming Zheng, Yinsong Wang, Wenjia Bai, Declan P. O'Regan, Chen Qin
PDF
TLControl: Trajectory and Language Control for Human Motion Synthesis Weilin Wan, Zhiyang Dou, Taku Komura, Wenping Wang, Dinesh Jayaraman, Lingjie Liu
PDF
To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy to Generate Unsafe Images ... for Now Yimeng Zhang, Jinghan Jia, Xin Chen, Aochuan Chen, Yihua Zhang, Jiancheng Liu, Ke Ding, Sijia Liu
PDF
To Supervise or Not to Supervise: Understanding and Addressing the Key Challenges of Point Cloud Transfer Learning Souhail Hadgi, Lei Li, Maks Ovsjanikov
PDF
TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao
PDF
Token Compensator: Altering Inference Cost of Vision Transformer Without Re-Tuning Shibo Jie, Yehui Tang, Jianyuan Guo, Zhi-Hong Deng, Kai Han, Yunhe Wang
PDF
Tokenize Anything via Prompting Ting Pan, Lulu Tang, Xinlong Wang, Shiguang Shan
PDF
Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture Xuanchen Li, Yuhao Cheng, Xingyu Ren, Haozhe Jia, Di Xu, Wenhan Zhu, Yichao Yan
PDF
Topology-Preserving Downsampling of Binary Images Chia-Chia Chen, Chi-Han Peng
PDF
Toward INT4 Fixed-Point Training via Exploring Quantization Error for Gradients Dohyung Kim, Junghyup Lee, Jeimin Jeon, Jaehyeon Moon, Bumsub Ham
PDF
Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning Yan Li, Weiwei Guo, Xue Yang, Ning Liao, Dunyun He, Jiaqi Zhou, Wenxian Yu
PDF
Toward Tiny and High-Quality Facial Makeup with Data Amplify Learning Qiaoqiao Jin, Xuanhong Chen, Meiguang Jin, Ying Chen, Rui Shi, Yucheng Zheng, Yupeng Zhu, Bingbing Ni
PDF
Towards a Density Preserving Objective Function for Learning on Point Sets Haritha Jayasinghe, Ioannis Brilakis
PDF
Towards Adaptive Pseudo-Label Learning for Semi-Supervised Temporal Action Localization Feixiang Zhou, Bryan Williams, Hossein Rahmani
PDF
Towards Architecture-Agnostic Untrained Network Priors for Image Reconstruction with Frequency Regularization Yilin Liu, Yunkui Pang, Jiang Li, Yong Chen, Pew-Thian Yap
PDF
Towards Certifiably Robust Face Recognition Seunghun Paik, Dongsoo Kim, Chanwoo Hwang, Sunpill Kim, Jae Hong Seo
PDF
Towards Compact Reversible Image Representations for Neural Style Transfer Xiyao Liu, Siyu Yang, Jian Zhang, Gerald Schaefer, Jiya Li, Xunli Fan, Songtao Wu, Hui Fang
PDF
Towards Dual Transparent Liquid Level Estimation in Biomedical Lab: Dataset, Methods and Practice Xiayu Wang, Ke Ma, Ruiyun Zhong, Xinggang Wang, Yi Fang, Yang Xiao, Tian Xia
PDF
Towards High-Quality 3D Motion Transfer with Realistic Apparel Animation Rong Wang, Wei Mao, Changsheng Lu, Hongdong Li
PDF
Towards Image Ambient Lighting Normalization Florin-Alexandru Vasluianu, Tim Seizinger, Zongwei Wu, Rakesh Ranjan, Radu Timofte
PDF
Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning Yibing Wei, Abhinav Gupta, Pedro Morgado
PDF
Towards Model-Agnostic Dataset Condensation by Heterogeneous Models Jun-Yeong Moon, Jung Uk Kim, Gyeong-Moon Park
PDF
Towards More Practical Group Activity Detection: A New Benchmark and Model Dongkeun Kim, Youngkil Song, Minsu Cho, Suha Kwak
PDF
Towards Multi-Modal Transformers in Federated Learning Guangyu Sun, Matias Mendieta, Aritra Dutta, Xin Li, Chen Chen
PDF
Towards Multimodal Open-Set Domain Generalization and Adaptation Through Self-Supervision Hao Dong, Eleni Chatzi, Olga Fink
PDF
Towards Multimodal Sentiment Analysis Debiasing via Bias Purification Dingkang Yang, Mingcheng Li, Dongling Xiao, Yang Liu, Kun Yang, Zhaoyu Chen, Yuzheng Wang, Peng Zhai, Ke Li, Lihua Zhang
PDF
Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching Meng Chu, Zhedong Zheng, Wei Ji, Tingyu Wang, Tat-Seng Chua
PDF
Towards Neuro-Symbolic Video Understanding Minkyu Choi, Harsh Goel, Mohammad Omama, Yunhao Yang, Sahil Shah, Sandeep Chinchali
PDF
Towards Open Domain Text-Driven Synthesis of Multi-Person Motions Mengyi Shan, Lu Dong, Yutao Han, Yuan Yao, Tao Liu, Ifeoma Nwogu, Guo-Jun Qi, Mitchell K Hill
PDF
Towards Open-Ended Visual Quality Comparison Haoning Wu, Hanwei Zhu, Zicheng Zhang, Erli Zhang, Chaofeng Chen, Liang Liao, Chunyi Li, Annan Wang, Wenxiu Sun, Qiong Yan, Xiaohong Liu, Guangtao Zhai, Shiqi Wang, Weisi Lin
PDF
Towards Open-Ended Visual Recognition with Large Language Models Qihang Yu, Xiaohui Shen, Liang-Chieh Chen
PDF
Towards Open-World Object-Based Anomaly Detection via Self-Supervised Outlier Synthesis Brian Kostadinov Shalon Isaac-Medina, Yona Falinie Abdul Gaus, Neelanjan Bhowmik, Toby P Breckon
PDF
Towards Physical World Backdoor Attacks Against Skeleton Action Recognition Qichen Zheng, Yi Yu, Siyuan Yang, Jun Liu, Kwok-Yan Lam, Alex Kot
PDF
Towards Real-World Adverse Weather Image Restoration: Enhancing Clearness and Semantics with Vision-Language Models Jiaqi Xu, Mengyang Wu, Xiaowei Hu, Chi-Wing Fu, Qi Dou, Pheng-Ann Heng
PDF
Towards Real-World Event-Guided Low-Light Video Enhancement and Deblurring Taewoo Kim, Jaeseok Jeong, Hoonhee Cho, Yuhwan Jeong, Kuk-Jin Yoon
PDF
Towards Reliable Advertising Image Generation Using Human Feedback Zhenbang Du, Wei Feng, Haohan Wang, Yaoyu Li, Jingsen Wang, Jian Li, Zheng Zhang, Jingjing Lv, Xin Zhu, Junsheng Jin, Junjie Shen, Zhangang Lin, Jingping Shao
PDF
Towards Reliable Evaluation and Fast Training of Robust Semantic Segmentation Models Francesco Croce, Naman D. Singh, Matthias Hein
PDF
Towards Robust Event-Based Networks for Nighttime via Unpaired Day-to-Night Event Translation Yuhwan Jeong, Hoonhee Cho, Kuk-Jin Yoon
PDF
Towards Robust Full Low-Bit Quantization of Super Resolution Networks Denis S. Makhov, Irina Zhelavskaya, Ruslan Ostapets, Dehua Song, Kirill Solodskikh
PDF
Towards Scene Graph Anticipation Rohith Peddi, Saksham Singh, Saurabh, Parag Singla, Vibhav Gogate
PDF
Towards Stable 3D Object Detection Jiabao Wang, Qiang Meng, Guochao Liu, Liujiang Yan, Ke Wang, Ming-Ming Cheng, Qibin Hou
PDF
Towards Unified Representation of Invariant-Specific Features in Missing Modality Face Anti-Spoofing Guanghao Zheng, Yuchen Liu, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong
PDF
TP2O: Creative Text Pair-to-Object Generation Using Balance Swap-Sampling Jun Li, Zedong Zhang, Jian Yang
PDF
TPA3D: Triplane Attention for Fast Text-to-3D Generation Bin-Shih Wu, Hong-En Chen, Sheng-Yu Huang, Yu-Chiang Frank Wang
PDF
Track Everything Everywhere Fast and Robustly Yunzhou Song, Jiahui Lei, Ziyun Wang, Lingjie Liu, Kostas Daniilidis
PDF
Track2Act: Predicting Point Tracks from Internet Videos Enables Generalizable Robot Manipulation Homanga Bharadhwaj, Roozbeh Mottaghi, Abhinav Gupta, Shubham Tulsiani
PDF
Trackastra: Transformer-Based Cell Tracking for Live-Cell Microscopy Benjamin Gallusser, Martin Weigert
PDF
Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance Liting Lin, Heng Fan, Zhipeng Zhang, Yaowei Wang, Yong Xu, Haibin Ling
PDF
TrackNeRF: Bundle Adjusting NeRF from Sparse and Noisy Views via Feature Tracks Jinjie Mai, Wenxuan Zhu, Sara Rojas, Jesus Zarzar, Abdullah Hamdi, Guocheng Qian, Bing Li, Silvio Giancola, Bernard Ghanem
PDF
TrafficNight: An Aerial Multimodal Benchmark for Nighttime Vehicle Surveillance Guoxing Zhang, Yiming Liu, Xiaoyu Yang, Chao Huang, Huang Hailong
PDF
Train till You Drop: Towards Stable and Robust Source-Free Unsupervised 3D Domain Adaptation Björn Michele, Alexandre Boulch, Tuan-Hung Vu, Gilles Puy, Renaud Marlet, Nicolas Courty
PDF
Trainable Highly-Expressive Activation Functions Irit Chelly, Shahaf E. Finder, Shira Ifergane, Oren Freifeld
PDF
Training a Secure Model Against Data-Free Model Extraction Zhenyi Wang, Li Shen, Junfeng Guo, Tiehang Duan, Siyu Luan, Tongliang Liu, Mingchen Gao
PDF
Training a Small Emotional Vision Language Model for Visual Art Comprehension Jing Zhang, Liang Zheng, Meng Wang, Dan Guo
PDF
Training-Free Composite Scene Generation for Layout-to-Image Synthesis Jiaqi Liu, Tao Huang, Chang Xu
PDF
Training-Free Model Merging for Multi-Target Domain Adaptation Wenyi Li, Huan-ang Gao, Mingju Gao, Beiwen Tian, Rong Zhi, Hao Zhao
PDF
Training-Free Video Temporal Grounding Using Large-Scale Pre-Trained Models Minghang Zheng, Xinhao Cai, Qingchao Chen, Yuxin Peng, Yang Liu
PDF
Trajectory-Aligned Space-Time Tokens for Few-Shot Action Recognition Pulkit Kumar, Namitha Padmanabhan, Luke Luo, Sai Saketh Rambhatla, Abhinav Shrivastava
PDF
TrajPrompt: Aligning Color Trajectory with Vision-Language Representations Li-Wu Tsao, Hao-Tang Tsui, Yu-Rou Tuan, Pei-Chi Chen, Kuan-Lin Wang, Jhih-Ciang Wu, Hong-Han Shuai, Wen-Huang Cheng
PDF
TRAM: Global Trajectory and Motion of 3D Humans from In-the-Wild Videos Yufu Wang, Ziyun Wang, Lingjie Liu, Kostas Daniilidis
PDF
TransCAD: A Hierarchical Transformer for CAD Sequence Inference from Point Clouds Elona Dupont, Kseniya Cherenkova, Dimitrios Mallis, Gleb A Gusev, Anis Kacem, Djamila Aouada
PDF
Transferable 3D Adversarial Shape Completion Using Diffusion Models Xuelong Dai, Bin Xiao
PDF
TransFusion -- a Transparency-Based Diffusion Model for Anomaly Detection Matic Fučka, Vitjan Zavrtanik, Danijel Skočaj
PDF
Tree-D Fusion: Simulation-Ready Tree Dataset from Single Images with Diffusion Priors Jae Joong Lee, Bosheng Li, Sara M Beery, Jonathan Huang, Songlin Fei, Raymond A. Yeh, Bedrich Benes
PDF
TreeSBA: Tree-Transformer for Self-Supervised Sequential Brick Assembly Mengqi Guo, Chen Li, Yuyang Zhao, Gim Hee Lee
PDF
Tri^2-Plane: Thinking Head Avatar via Feature Pyramid Luchuan Song, Pinxin Liu, Lele Chen, Guojun Yin, Chenliang Xu
PDF
TriNeRFLet: A Wavelet Based Triplane NeRF Representation Rajaei Khatib, Raja Giryes
PDF
TrojVLM: Backdoor Attack Against Vision Language Models Weimin Lyu, Lu Pang, Tengfei Ma, Haibin Ling, Chao Chen
PDF
TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias Sanghyun Jo, Soohyun Ryu, Sungyub Kim, Eunho Yang, Kyungsu Kim
PDF
TTT-MIM: Test-Time Training with Masked Image Modeling for Denoising Distribution Shifts Youssef Mansour, Xuyang Zhong, Serdar Caglar, Reinhard Heckel
PDF
Tuning-Free Image Customization with Image and Text Guidance Pengzhi Li, Qiang Nie, Ying Chen, Xi Jiang, Kai Wu, Yuhuan Lin, Yong Liu, Jinlong Peng, Chengjie Wang, Feng Zheng
PDF
Turbo: Informativity-Driven Acceleration Plug-in for Vision-Language Large Models Chen Ju, Haicheng Wang, Haozhe Cheng, Xu Chen, Zhonghua Zhai, Weilin Huang, Jinsong Lan, Shuai Xiao, Bo Zheng
PDF
TurboEdit: Real-Time Text-Based Disentangled Real Image Editing Zongze Wu, Nicholas I Kolkin, Jonathan Brandt, Richard Zhang, Eli Shechtman
PDF
Two-Stage Active Learning for Efficient Temporal Action Segmentation Yuhao Su, Ehsan Elhamifar
PDF
Two-Stage Video Shadow Detection via Temporal-Spatial Adaption Xin Duan, Yu Cao, Lei Zhu, Gang Fu, Xin Wang, Renjie Zhang, Ping Li
PDF
U-COPE: Taking a Further Step to Universal 9D Category-Level Object Pose Estimation Li Zhang, Weiqing Meng, Yan Zhong, Bin Kong, Mingliang Xu, Jianming Du, Xue Wang, Rujing Wang, Liu Liu
PDF
UAV First-Person Viewers Are Radiance Field Learners Liqi Yan, Qifan Wang, Junhan Zhao, Qiang Guan, Zheng Tang, Jianhui Zhang, Dongfang Liu
PDF
uCAP: An Unsupervised Prompting Method for Vision-Language Models A. Tuan Nguyen, Kai Sheng Tai, Bor-Chun Chen, Satya Narayan Shukla, Hanchao Yu, Philip Torr, Tai-Peng Tian, Ser-Nam Lim
PDF
UCIP: A Universal Framework for Compressed Image Super-Resolution Using Dynamic Prompt Xin Li, Bingchen Li, Yeying Jin, Cuiling Lan, Hanxin Zhu, Yulin Ren, Zhibo Chen
PDF
UDA-Bench: Revisiting Common Assumptions in Unsupervised Domain Adaptation Using a Standardized Framework Tarun Kalluri, Sreyas Ravichandran, Manmohan Chandraker
PDF
UDiffText: A Unified Framework for High-Quality Text Synthesis in Arbitrary Images via Character-Aware Diffusion Models Yiming Zhao, Zhouhui Lian
PDF
UGG: Unified Generative Grasping Jiaxin Lu, Hao Kang, Haoxiang Li, Bo Liu, Yiding Yang, Qixing Huang, Gang Hua
PDF
UL-VIO: Ultra-Lightweight Visual-Inertial Odometry with Noise Robust Test-Time Adaptation Jinho Park, Se Young Chun, Mingoo Seok
PDF
UMBRAE: Unified Multimodal Brain Decoding Weihao Xia, Raoul de Charette, A. Cengiz Oztireli, Jing-Hao Xue
PDF
UMERegRobust – Universal Manifold Embedding Compatible Features for Robust Point Cloud Registration Yuval Haitman, Amit Efraim, Joseph M Francos
PDF
UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding Bowen Shi, Peisen Zhao, Zichen Wang, Yuhang Zhang, Yaoming Wang, Jin Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian, Xiaopeng Zhang
PDF
Un-EVIMO: Unsupervised Event-Based Independent Motion Segmentation Ziyun Wang, Jinyuan Guo, Kostas Daniilidis
PDF
Uncertainty Calibration with Energy Based Instance-Wise Scaling in the Wild Dataset Mijoo Kim, Junseok Kwon
PDF
Uncertainty-Aware Sign Language Video Retrieval with Probability Distribution Modeling Xuan Wu, Hongxiang Li, Yuanjiang Luo, Xuxin Cheng, Xianwei Zhuang, Meng Cao, Keren Fu
PDF
Uncertainty-Driven Spectral Compressive Imaging with Spatial-Frequency Transformer Lintao Peng, Siyu Xie, Liheng Bian
PDF
Understanding and Mitigating Human-Labelling Errors in Supervised Contrastive Learning Zijun Long, Lipeng Zhuang, George W Killick, Richard McCreadie, Gerardo Aragon-Camarasa, Paul Henderson
PDF
Understanding Multi-Compositional Learning in Vision and Language Models via Category Theory Sotirios Panagiotis Chytas, Hyunwoo J Kim, Vikas Singh
PDF
Understanding Physical Dynamics with Counterfactual World Modeling Rahul Venkatesh, Honglin Chen, Kevin Feigelis, Daniel M Bear, Khaled Jedoui, Klemen Kotar, Felix J Binder, Wanhee Lee, Sherry Liu, Kevin Smith, Judith E. Fan, Daniel Yamins
PDF
Uni3DL: A Unified Model for 3D Vision-Language Understanding Xiang Li, Jian Ding, Zhaoyang Chen, Mohamed Elhoseiny
PDF
UNIC: Universal Classification Models via Multi-Teacher Distillation Yannis Kalantidis, Diane Larlus, Mert Bulent Sariyildiz, Philippe Weinzaepfel, Thomas Lucas
PDF
UniCal: Unified Neural Sensor Calibration Ze Yang, George G Chen, Haowei Zhang, Kevin Ta, Ioan Andrei Bârsan, Daniel Murphy, Sivabalan Manivasagam, Raquel Urtasun
PDF
UniCode: Learning a Unified Codebook for Multimodal Large Language Models Sipeng Zheng, Bohan Zhou, Yicheng Feng, Ye Wang, Zongqing Lu
PDF
UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation Zexiang Liu, Yangguang Li, Youtian Lin, Xin Yu, Sida Peng, Yan-Pei Cao, Xiaojuan Qi, Xiaoshui Huang, Ding Liang, Wanli Ouyang
PDF
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation Hao Fang, Peng Wu, Yawei Li, Xinxin Zhang, Xiankai Lu
PDF
Unified Local-Cloud Decision-Making via Reinforcement Learning Kathakoli Sengupta, Zhongkai Shangguan, Sandesh Bharadwaj, Sanjay Arora, Eshed Ohn-Bar, Renato Mancuso
PDF
Unified Medical Image Pre-Training in Language-Guided Common Semantic Space Xiaoxuan He, Yifan Yang, Xinyang Jiang, Xufang Luo, Haoji Hu, Siyun Zhao, Dongsheng Li, Yuqing Yang, Lili Qiu
PDF
UniFS: Universal Few-Shot Instance Perception with Point Representations Sheng Jin, Ruijie Yao, Lumin Xu, Wentao Liu, Chen Qian, Ji Wu, Ping Luo
PDF
Unifying 3D Vision-Language Understanding via Promptable Queries Ziyu Zhu, Zhuofan Zhang, Xiaojian Ma, Xuesong Niu, Yixin Chen, Baoxiong Jia, Zhidong Deng, Siyuan Huang, Qing Li
PDF
UniINR: Event-Guided Unified Rolling Shutter Correction, Deblurring, and Interpolation Yunfan Lu, Guoqiang Liang, Yusheng Wang, Lin Wang, Hui Xiong
PDF
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers Cong Wei, Yang Chen, Haonan Chen, Hexiang Hu, Ge Zhang, Jie Fu, Alan Ritter, Wenhu Chen
PDF
UNIKD: UNcertainty-Filtered Incremental Knowledge Distillation for Neural Implicit Representation Mengqi Guo, Chen Li, Hanlin Chen, Gim Hee Lee
PDF
UniM2AE: Multi-Modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving Jian Zou, Tianyu Huang, Guanglei Yang, Zhenhua Guo, Tao Luo, Chun-Mei Feng, Wangmeng Zuo
PDF
UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection Yingsen Zeng, Yujie Zhong, Chengjian Feng, Lin Ma
PDF
UniProcessor: A Text-Induced Unified Low-Level Image Processor Huiyu Duan, Xiongkuo Min, Sijing Wu, Wei Shen, Guangtao Zhai
PDF
UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening Siyuan Cheng, Guangyu Shen, Kaiyuan Zhang, Guanhong Tao, Shengwei An, Hanxi Guo, Shiqing Ma, Xiangyu Zhang
PDF
UniTalker: Scaling up Audio-Driven 3D Facial Animation Through a Unified Model Xiangyu Fan, Jiaqi Li, Zhiqian Lin, Weiye Xiao, Lei Yang
PDF
UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction Lan Feng, Mohammadhossein Bahari, Kaouther Messaoud, Eloi Zablocki, Matthieu Cord, Alexandre Alahi
PDF
UniVoxel: Fast Inverse Rendering by Unified Voxelization of Scene Representation Shuang Wu, Songlin Tang, Guangming Lu, Jianzhuang Liu, Wenjie Pei
PDF
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning Jianjie Luo, Jingwen Chen, Yehao Li, Yingwei Pan, Jianlin Feng, Hongyang Chao, Ting Yao
PDF
Unleashing the Potential of the Semantic Latent Space in Diffusion Models for Image Dehazing Zizheng Yang, Hu Yu, Bing Li, Jinghao Zhang, Jie Huang, Feng Zhao
PDF
Unleashing the Power of Prompt-Driven Nucleus Instance Segmentation Zhongyi Shui, Yunlong Zhang, Kai Yao, Chenglu Zhu, Sunyi Zheng, Jingxiong Li, Honglin Li, Yuxuan Sun, Ruizhe Guo, Lin Yang
PDF
Unlocking Attributes' Contribution to Successful Camouflage: A Combined Textual and Visual Analysis Strategy Hong Zhang, Yixuan Lyu, Qian Yu, Hanyang Liu, Huimin Ma, Yuan Ding, Yifan Yang
PDF
Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image Pengkun Jiao, Na Zhao, Jingjing Chen, Yu-Gang Jiang
PDF
Unlocking the Potential of Federated Learning: The Symphony of Dataset Distillation via Deep Generative Latents Yuqi Jia, Saeed Vahidian, Jingwei Sun, Jianyi Zhang, Vyacheslav Kungurtsev, Neil Zhenqiang Gong, Yiran Chen
PDF
Unmasking Bias in Diffusion Model Training Hu Yu, Li Shen, Jie Huang, Hongsheng Li, Feng Zhao
PDF
Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement Lingyu Zhu, Wenhan Yang, Baoliang Chen, Hanwei Zhu, Zhangkai Ni, Qi Mao, Shiqi Wang
PDF
Unsqueeze [CLS] Bottleneck to Learn Rich Representations Qing Su, Shihao Ji
PDF
Unsupervised Dense Prediction Using Differentiable Normalized Cuts Yanbin Liu, Stephen Gould
PDF
Unsupervised Exposure Correction Ruodai Cui, Li Niu, Guosheng Hu
PDF
Unsupervised Moving Object Segmentation with Atmospheric Turbulence Dehao Qin, Ripon K Saha, Woojeh Chung, Suren Jayasuriya, Jinwei Ye, Nianyi Li
PDF
Unsupervised Multi-Modal Medical Image Registration via Invertible Translation Mengjie Guo
PDF
Unsupervised Representation Learning by Balanced Self Attention Matching Daniel Shalam, Simon Korman
PDF
Unsupervised Variational Translator for Bridging Image Restoration and High-Level Vision Tasks Jiawei Wu, Zhi Jin
PDF
Unsupervised, Online and On-the-Fly Anomaly Detection for Non-Stationary Image Distributions Declan GD McIntosh, Alexandra Branzan Albu
PDF
Unveiling Advanced Frequency Disentanglement Paradigm for Low-Light Image Enhancement Kun Zhou, Xinyu Lin, Wenbo Li, Xiaogang Xu, Yuanhao Cai, Zhonghang Liu, Xiaoguang Han, Jiangbo Lu
PDF
Unveiling and Mitigating Memorization in Text-to-Image Diffusion Models Through Cross Attention Jie Ren, Yaxin Li, Shenglai Zeng, Han Xu, Lingjuan Lyu, Yue Xing, Jiliang Tang
PDF
Unveiling Privacy Risks in Stochastic Neural Networks Training: Effective Image Reconstruction from Gradients Yiming Chen, Xiangyu Yang, Nikos Deligiannis
PDF
Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Models Hao Cheng, Erjia Xiao, Jindong Gu, Le Yang, Jinhao Duan, Jize Zhang, Jiahang Cao, Kaidi Xu, Renjing Xu
PDF
UpFusion: Novel View Diffusion from Unposed Sparse View Observations Bharath Raj Nagoor Kani, Hsin-Ying Lee, Sergey Tulyakov, Shubham Tulsiani
PDF
UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues Vandad Davoodnia, Saeed Ghorbani, Marc-André Carbonneau, Alexandre Messier, Ali Etemad
PDF
Upper-Body Hierarchical Graph for Skeleton Based Emotion Recognition in Assistive Driving Jiehui Wu, Jiansheng Chen, Qifeng Luo, Siqi Liu, Youze Xue, Huimin Ma
PDF
Urban Waterlogging Detection: A Challenging Benchmark and Large-Small Model Co-Adapter Suqi Song, Chenxu Zhang, Peng Zhang, Pengkun Li, Fenglong Song, Lei Zhang
PDF
URS-NeRF: Unordered Rolling Shutter Bundle Adjustment for Neural Radiance Fields Bo Xu, Liu Ziao, Mengqi Guo, Jiancheng Li, Gim Hee Lee
PDF
Using My Artistic Style? You Must Obtain My Authorization Xiuli Bi, Haowei Liu, Weisheng Li, Bo Liu, Bin Xiao
PDF
V-IRL: Grounding Virtual Intelligence in Real Life Jihan Yang, Runyu Ding, Ellis L Brown, Xiaojuan Qi, Saining Xie
PDF
V-Trans4Style: Visual Transition Recommendation for Video Production Style Adaptation Pooja Guhan, Tsung-Wei Huang, Guan-Ming Su, Subhadra Gopalakrishnan, Dinesh Manocha
PDF
V2X-Real: A Large-Scale Dataset for Vehicle-to-Everything Cooperative Perception Hao Xiang, Xin Xia, Zhaoliang Zheng, Runsheng Xu, Letian Gao, Zewei Zhou, Xu Han, Xinkai Ji, Mingxi Li, Zonglin Meng, Li Jin, Mingyue Lei, Zhaoyang Ma, Zihang He, Haoxuan Ma, Yunshuang Yuan, Yingqian Zhao, Jiaqi Ma
PDF
Vamos: Versatile Action Models for Video Understanding Shijie Wang, Qi Zhao, Minh Quan Do, Nakul Agarwal, Kwonjoon Lee, Chen Sun
PDF
Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, Jinrong Yang, Jianjian Sun, Chunrui Han, Xiangyu Zhang
PDF
VCD-Texture: Variance Alignment Based 3D-2D Co-Denoising for Text-Guided Texturing Shang Liu, Chaohui Yu, Chenjie Cao, Wen Qian, Fan Wang
PDF
VCP-CLIP: A Visual Context Prompting Model for Zero-Shot Anomaly Segmentation Zhen Qu, Xian Tao, Mukesh Prasad, Fei Shen, Zhengtao Zhang, Xinyi Gong, Guiguang Ding
PDF
VeCLIP: Improving CLIP Training via Visual-Enriched Captions Zhengfeng Lai, Haotian Zhang, Bowen Zhang, Wentao Wu, Haoping Bai, Aleksei Timofeev, Xianzhi Du, Zhe Gan, Jiulong Shan, Chen-Nee Chuah, Yinfei Yang, Meng Cao
PDF
VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting Using Learned Priors Sungwon Hwang, Min-Jung Kim, Taewoong Kang, Jayeon Kang, Jaegul Choo
PDF
Veil Privacy on Visual Data: Concealing Privacy for Humans, Unveiling for DNNs Shuchao Pang, Ruhao Ma, Bing Li, Yongbin Zhou, Yazhou Yao
PDF
VEON: Vocabulary-Enhanced Occupancy Prediction Jilai Zheng, Pin Tang, Zhongdao Wang, Guoqing Wang, Xiangxuan Ren, Bailan Feng, Chao Ma
PDF
Versatile Incremental Learning: Towards Class and Domain-Agnostic Incremental Learning Min-Yeong Park, Jae-Ho Lee, Gyeong-Moon Park
PDF
VersatileGaussian: Real-Time Neural Rendering for Versatile Tasks Using Gaussian Splatting Renjie Li, Zhiwen Fan, Bohua Wang, Peihao Wang, Zhangyang Wang, Xi Wu
PDF
VETRA: A Dataset for Vehicle Tracking in Aerial Imagery - New Challenges for Multi-Object Tracking Jens Hellekes, Manuel Mühlhaus, Reza Bahmanyar, Seyed Majid Azimi, Franz Kurz
PDF
VF-NeRF: Viewshed Fields for Rigid NeRF Registration Leo Segre, Shai Avidan
PDF
VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models Junlin Han, Filippos Kokkinos, Philip Torr
PDF
ViC-MAE: Self-Supervised Representation Learning from Images and Video with Contrastive Masked Autoencoders Jefferson Hernandez, Ruben Villegas, Vicente Ordonez
PDF
Video Editing via Factorized Diffusion Distillation Uriel Singer, Amit Zohar, Yuval Kirstain, Shelly Sheynin, Adam Polyak, Devi Parikh, Yaniv Taigman
PDF
Video Question Answering with Procedural Programs Rohan Choudhury, Koichiro Niinuma, Kris Kitani, Laszlo A Jeni
PDF
VideoAgent: A Memory-Augmented Multimodal Agent for Video Understanding Yue Fan, Xiaojian Ma, Rujie Wu, Yuntao Du, Jiaqi Li, Zhi Gao, Qing Li
PDF
VideoAgent: Long-Form Video Understanding with Large Language Model as Agent Xiaohan Wang, Yuhui Zhang, Orr Zohar, Serena Yeung-Levy
PDF
VideoClusterNet: Self-Supervised and Adaptive Face Clustering for Videos Devesh Walawalkar, Pablo Garrido
PDF
VideoMamba: Spatio-Temporal Selective State Space Model Jinyoung Park, Hee-Seon Kim, Kangwook Ko, Minbeom Kim, Changick Kim
PDF
VideoMamba: State Space Model for Efficient Video Understanding Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, Limin Wang, Yu Qiao
PDF
Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion Xiang Fan, Anand Bhattad, Ranjay Krishna
PDF
VideoStudio: Generating Consistent-Content and Multi-Scene Videos Fuchen Long, Zhaofan Qiu, Ting Yao, Tao Mei
PDF
View Selection for 3D Captioning via Diffusion Ranking Tiange Luo, Justin Johnson, Honglak Lee
PDF
View-Consistent 3D Editing with Gaussian Splatting Yuxuan Wang, Xuanyu Yi, Zike Wu, Na Zhao, Long Chen, Hanwang Zhang
PDF
View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields Haodi He, Colton Stearns, Adam Harley, Leonidas Guibas
PDF
ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers Jinke Li, Xiao He, Chonghua Zhou, Xiaoqiang Cheng, Yang Wen, Dan Zhang
PDF
Viewpoint Textual Inversion: Discovering Scene Representations and 3D View Control in 2D Diffusion Models James Burgess, Kuan-Chieh Wang, Serena Yeung-Levy
PDF
ViG-Bias: Visually Grounded Bias Discovery and Mitigation Badr-Eddine Marani, Mohamed Hanini, Nihitha Malayarukil, Stergios Christodoulidis, Maria Vakalopoulou, Enzo Ferrante
PDF
ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling Siming Yan, Min Bai, Weifeng Chen, Xiong Zhou, Qixing Huang, Li Erran Li
PDF
ViLA: Efficient Video-Language Alignment for Video Question Answering Xijun Wang, Junbang Liang, Chun-Kai Wang, Kenan Deng, Yu Lou, Ming C Lin, Shan Yang
PDF
ViPer: Visual Personalization of Generative Models via Individual Preference Learning Sogand Salehi, Mahdi Shafiei, Roman Bachmann, Teresa Yeo, Amir Zamir
PDF
VISA: Reasoning Video Object Segmentation via Large Language Model Cilin Yan, Haochen Wang, Shilin Yan, Xiaolong Jiang, Yao Hu, Guoliang Kang, Weidi Xie, Efstratios Gavves
PDF
VISAGE: Video Instance Segmentation with Appearance-Guided Enhancement Hanjung Kim, Jaehyun Kang, Miran Heo, Sukjun Hwang, Seoung Wug Oh, Seon Joo Kim
PDF
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding Ofir Abramovich, Niv Nayman, Sharon Fogel, Inbal Lavi, Ron Litman, Shahar Tsiper, Royee Tichauer, Srikar Appalaraju, Shai Mazor, R. Manmatha
PDF
Visible and Clear: Finding Tiny Objects in Difference Map Bing Cao, Haiyu Yao, Pengfei Zhu, Qinghua Hu
PDF
Vision-Language Action Knowledge Learning for Semantic-Aware Action Quality Assessment Huangbiao Xu, Xiao Ke, Yuezhou Li, Rui Xu, Huanqi Wu, Xiaofeng Lin, Wenzhong Guo
PDF
Vision-Language Dual-Pattern Matching for Out-of-Distribution Detection Zihan Zhang, Zhuo Xu, Xiang Xiang
PDF
VisionLLaMA: A Unified Llama Backbone for Vision Tasks Xiangxiang Chu, Jianlin Su, Bo Zhang, Chunhua Shen
PDF
VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions Seokha Moon, Hyun Woo, Hongbeen Park, Haeji Jung, Reza Mahjourian, Hyung-gun Chi, Hyerin Lim, Sangpil Kim, Jinkyu Kim
PDF
Vista3D: Unravel the 3D Darkside of a Single Image Qiuhong Shen, Xingyi Yang, Michael Bi Mi, Xinchao Wang
PDF
Visual Alignment Pre-Training for Sign Language Translation Peiqi Jiao, Yuecong Min, Xilin Chen
PDF
Visual Grounding for Object-Level Generalization in Reinforcement Learning Haobin Jiang, Zongqing Lu
PDF
Visual Prompting via Partial Optimal Transport Mengyu Zheng, Zhiwei Hao, Yehui Tang, Chang Xu
PDF
Visual Relationship Transformation Xiaoyu Xu, Jiayan Qiu, Baosheng Yu, Zhou Wang
PDF
Visual Text Generation in the Wild Yuanzhi Zhu, Jiawei Liu, Feiyu Gao, Wenyu Liu, Xinggang Wang, Peng Wang, Fei Huang, Cong Yao, Zhibo Yang
PDF
VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models Shicheng Li, Lei Li, Yi Liu, Shuhuai Ren, Yuanxin Liu, Rundong Gao, Xu Sun, Lu Hou
PDF
VividDreamer: Invariant Score Distillation for Hyper-Realistic Text-to-3D Generation Wenjie Zhuo, Fan Ma, Hehe Fan, Yi Yang
PDF
VLAD-BuFF: Burst-Aware Fast Feature Aggregation for Visual Place Recognition Ahmad Khaliq, Ming Xu, Stephen Hausler, Michael J Milford, Sourav Garg
PDF
Volumetric Rendering with Baked Quadrature Fields Gopal Sharma, Daniel Rebain, Kwang Moo Yi, Andrea Tagliasacchi
PDF
VP-SAM: Taming Segment Anything Model for Video Polyp Segmentation via Disentanglement and Spatio-Temporal Side Network Zhixue Fang, Yuzhi Liu, Huisi Wu, Jing Qin
PDF
VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, Antonio Agudo, Francesc Moreno
PDF
VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving Yibo Liu, Zheyuan Yang, Guile Wu, Yuan Ren, Kejian Lin, Liu Bingbing, Yang Liu, Jinjun Shan
PDF
VSViG: Real-Time Video-Based Seizure Detection via Skeleton-Based Spatiotemporal ViG Yankun Xu, Junzhe Wang, Yun-Hsuan Chen, Jie Yang, Wenjie Ming, Shuang Wang, Mohamad Sawan
PDF
Walker: Self-Supervised Multiple Object Tracking by Walking on Temporal Object Appearance Graphs Mattia Segù, Luigi Piccinelli, Siyuan Li, Luc Van Gool, Fisher Yu, Bernt Schiele
PDF
WAS: Dataset and Methods for Artistic Text Segmentation Xudong Xie, Yuzhe Li, Yang Liu, Zhifei Zhang, Zhaowen Wang, Wei Xiong, Xiang Bai
PDF
WaSt-3D: Wasserstein-2 Distance for Scene-to-Scene Stylization on 3D Gaussians Dmytro Kotovenko, Olga Grebenkova, Nikolaos Sarafianos, Avinash Paliwal, Pingchuan Ma, Omid Poursaeed, Sreyas Mohan, Yuchen Fan, Yilei Li, Rakesh Ranjan, Bjorn Ommer
PDF
Watch Your Steps: Local Image and Scene Editing by Text Instructions Ashkan Mirzaei, Tristan T Aumentado-Armstrong, Marcus A Brubaker, Jonathan Kelly, Alex Levinshtein, Konstantinos G Derpanis, Igor Gilitschenski
PDF
Watching It in Dark: A Target-Aware Representation Learning Framework for High-Level Vision Tasks in Low Illumination Yunan Li, Yihao Zhang, Shoude Li, Long Tian, Dou Quan, Chaoneng Li, Qiguang Miao
PDF
WAVE: Warping DDIM Inversion Features for Zero-Shot Text-to-Video Editing Yutang Feng, Sicheng Gao, Yuxiang Bao, Xiaodi Wang, Shumin Han, Juan Zhang, Baochang Zhang, Angela Yao
PDF
Wavelength-Embedding-Guided Filter-Array Transformer for Spectral Demosaicing Haijin Zeng, Hiep Luong, Wilfried Philips
PDF
Wavelet Convolutions for Large Receptive Fields Shahaf E Finder, Roy Amoyal, Eran Treister, Oren Freifeld
PDF
WBP: Training-Time Backdoor Attacks Through Hardware-Based Weight Bit Poisoning Kunbei Cai, Zhenkai Zhang, Qian Lou, Fan Yao
PDF
Weak-to-Strong Compositional Learning from Generative Models for Language-Based Object Detection Kwanyong Park, Kuniaki Saito, Donghyun Kim
PDF
Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance Kuan-Chih Huang, Yi-Hsuan Tsai, Ming-Hsuan Yang
PDF
Weakly Supervised Co-Training with Swapping Assignments for Semantic Segmentation Xinyu Yang, Hossein Rahmani, Dame S Black, Bryan M Williams
PDF
Weakly-Supervised 3D Hand Reconstruction with Knowledge Prior and Uncertainty Guidance Yufei Zhang, Jeffrey Kephart, Qiang Ji
PDF
Weakly-Supervised Camera Localization by Ground-to-Satellite Image Registration Yujiao Shi, Hongdong Li, Akhil Perincherry, Ankit Vora
PDF
Weakly-Supervised Spatio-Temporal Video Grounding with Variational Cross-Modal Alignment Yang Jin, Yadong Mu
PDF
Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment Mengting Chen, Xi Chen, Zhonghua Zhai, Chen Ju, Xuewen Hong, Jinsong Lan, Shuai Xiao
PDF
WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation Zirui Shao, Feiyu Gao, Hangdi Xing, Zepeng Zhu, Zhi Yu, Jiajun Bu, Qi Zheng, Cong Yao
PDF
WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model Haisheng Fu, Jie Liang, Zhenman Fang, Jingning Han, Feng Liang, Guohe Zhang
PDF
WeCromCL: Weakly Supervised Cross-Modality Contrastive Learning for Transcription-Only Supervised Text Spotting Jingjing Wu, Zhengyao Fang, Pengyuan Lyu, Chengquan Zhang, Fanglin Chen, Guangming Lu, Wenjie Pei
PDF
Weight Conditioning for Smooth Optimization of Neural Networks Hemanth Saratchandran, Thomas X Wang, Simon Lucey
PDF
Weighted Ensemble Models Are Strong Continual Learners Imad Eddine Marouf, Subhankar Roy, Enzo Tartaglione, Stéphane Lathuilière
PDF
Weighting Pseudo-Labels via High-Activation Feature Index Similarity and Object Detection for Semi-Supervised Segmentation Prantik Howlader, Hieu Le, Dimitris Samaras
PDF
WHAC: World-Grounded Humans and Cameras Wanqi Yin, Zhongang Cai, Chen Wei, Fanzhou Wang, Ruisi Wang, Haiyi Mei, Weiye Xiao, Zhitao Yang, Qingping Sun, Atsushi Yamashita, Ziwei Liu, Lei Yang
PDF
When and How Do Negative Prompts Take Effect? Yuanhao Ban, Ruochen Wang, Tianyi Zhou, Minhao Cheng, Boqing Gong, Cho-Jui Hsieh
PDF
When Do We Not Need Larger Vision Models? Baifeng Shi, Ziyang Wu, Maolin Mao, Xin Wang, Trevor Darrell
PDF
When Fast Fourier Transform Meets Transformer for Image Restoration Xingyu Jiang, Xiuhui Zhang, Ning Gao, Yue Deng
PDF
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset Yi Zhang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu
PDF
Where Am I? Scene Retrieval with Language Jiaqi Chen, Daniel Barath, Iro Armeni, Marc Pollefeys, Hermann Blum
PDF
Which Model Generated This Image? A Model-Agnostic Approach for Origin Attribution Fengyuan Liu, Haochen Luo, Yiming Li, Philip Torr, Jindong Gu
PDF
WildRefer: 3D Object Localization in Large-Scale Dynamic Scenes with Multi-Modal Visual Data and Natural Language Zhenxiang Lin, Xidong Peng, Peishan Cong, Ge Zheng, Yujing Sun, Yuenan Hou, Xinge Zhu, Sibei Yang, Yuexin Ma
PDF
WildVidFit: Video Virtual Try-on in the Wild via Image-Based Controlled Diffusion Models Zijian He, Peixin Chen, Guangrun Wang, Guanbin Li, Philip Torr, Liang Lin
PDF
WiMANS: A Benchmark Dataset for WiFi-Based Multi-User Activity Sensing Shuokang Huang, Kaihan Li, Di You, Yichong Chen, Arvin Lin, Siying Liu, Xiaohui Li, Julie A. McCann
PDF
WindPoly: Polygonal Mesh Reconstruction via Winding Numbers Xin He, Chenlei Lv, Pengdi Huang, Hui Huang
PDF
Within the Dynamic Context: Inertia-Aware 3D Human Modeling with Pose Sequence Yutong Chen, Yifan Zhan, Zhihang Zhong, Wei Wang, Xiao Sun, Yu Qiao, Yinqiang Zheng
PDF
WordRobe: Text-Guided Generation of Textured 3D Garments Astitva Srivastava, Pranav Manu, Amit Raj, Varun Jampani, Avinash Sharma
PDF
WorldPose: A World Cup Dataset for Global 3D Human Pose Estimation Tianjian Jiang, Johsan Billingham, Sebastian Müksch, Juan J Zarate, Nicolas Evans, Martin R. Oswald, Marc Pollefeys, Otmar Hilliges, Manuel Kaufmann, Jie Song
PDF
WoVoGen: World Volume-Aware Diffusion for Controllable Multi-Camera Driving Scene Generation Jiachen Lu, Ze Huang, Zeyu Yang, Zhang Jiahui, Li Zhang
PDF
WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models Xin-Jian Wu, Ruisong Zhang, Jie Qin, Shijie Ma, Cheng-Lin Liu
PDF
WRIM-Net: Wide-Ranging Information Mining Network for Visible-Infrared Person Re-Identification Yonggan Wu, Ling-Chao Meng, Yuan Zichao, Sixian Chan, Hong-Qiang Wang
PDF
WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering Pingyi Chen, Chenglu Zhu, Sunyi Zheng, Honglin Li, Lin Yang
PDF
WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-Grained Spatial-Temporal Understanding Quan Kong, Yuki Kawana, Rajat Saini, Ashutosh Kumar, Jingjing Pan, Ta Gu, Yohei Ozao, Balazs Opra, Yoichi Sato, Norimasa Kobori
PDF
X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs Sirnam Swetha, Jinyu Yang, Tal Neiman, Mamshad Nayeem Rizve, Son Tran, Benjamin Yao, Trishul A Chilimbi, Mubarak Shah
PDF
X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and Its Emergent Cross-Modal Reasoning Artemis Panagopoulou, Le Xue, Ning Yu, Li Junnan, Dongxu Li, Shafiq Joty, Ran Xu, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles
PDF
X-Pose: Detecting Any Keypoints Jie Yang, Ailing Zeng, Ruimao Zhang, Lei Zhang
PDF
XPSR: Cross-Modal Priors for Diffusion-Based Image Super-Resolution Qu Yunpeng, Kun Yuan, Kai Zhao, Qizhi Xie, Jinhua Hao, Ming Sun, Chao Zhou
PDF
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information Chien-Yao Wang, I-Hau Yeh, Hong-Yuan Mark Liao
PDF
You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception Sheng Jin, Shuhuai Li, Tong Li, Wentao Liu, Chen Qian, Ping Luo
PDF
You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation Mehdi Noroozi, Isma Hadji, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos
PDF
Zero-Shot Adaptation for Approximate Posterior Sampling of Diffusion Models in Inverse Problems Yasar U Alcalar, Mehmet Akcakaya
PDF
Zero-Shot Detection of AI-Generated Images Davide Cozzolino, Giovanni Poggi, Matthias Niessner, Luisa Verdoliva
PDF
Zero-Shot Image Feature Consensus with Deep Functional Maps Xinle Cheng, Congyue Deng, Adam Harley, Yixin Zhu, Leonidas Guibas
PDF
Zero-Shot Multi-Object Scene Completion Shun Iwase, Katherine Liu, Vitor Guizilini, Adrien Gaidon, Kris Kitani, Rareș A Ambruș, Sergey Zakharov
PDF
Zero-Shot Object Counting with Good Exemplars Huilin Zhu, Jingling Yuan, Zhengwei Yang, Yu Guo, Xian Zhong, Zheng Wang, Shengfeng He
PDF
Zero-Shot Text-Guided Infinite Image Synthesis with LLM Guidance Soyeong Kwon, Taegyeong Lee, Taehwan Kim
PDF
ZeroI2V: Zero-Cost Adaptation of Pre-Trained Transformers from Image to Video Xinhao Li, Yuhan Zhu, Limin Wang
PDF
ZeST: Zero-Shot Material Transfer from a Single Image Ta-Ying Cheng, Prafull Sharma, Andrew Markham, Niki Trigoni, Varun Jampani
PDF
ZigMa: A DiT-Style Zigzag Mamba Diffusion Model Vincent Tao Hu, Stefan A Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes S Fischer, Bjorn Ommer
PDF
ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs Viraj Shah, Nataniel Ruiz, Forrester Cole, Erika Lu, Svetlana Lazebnik, Yuanzhen Li, Varun Jampani
PDF
ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model Fu-Yun Wang, Zhaoyang Huang, Qiang Ma, Guanglu Song, Xudong Lu, Weikang Bian, Yijin Li, Yu Liu, Hongsheng Li
PDF