CVPRW 2025

659 papers

3D Face Reconstruction from Radar Images Valentin Braeutigam, Vanessa Wirth, Ingrid Ullmann, Christian Schüßler, Martin Vossiek, Matthias Berking, Bernhard Egger
PDF
3rd Multi-Modal Aerial View Image Challenge: Sensor Domain Translation - PBVS 2025 Dylan Bowald, Justice Wheelwright, Oliver Nina, Ángel D. Sappa, Riad I. Hammoud, Erik Blasch, Nathan Inkawhich
PDF
4th Multi-Modal Aerial View Image Challenge: SAR Classification - PBVS 2025 Nathan Inkawhich, Claire Thorp, Justice Wheelwright, Oliver Nina, Dylan Bowald, Ángel D. Sappa, Erik Blasch
PDF
A Dataset for Semantic and Instance Segmentation of Modern Fruit Orchards Tieqiao Wang, Abhinav Jain, Liqiang He, Cindy Grimm, Sinisa Todorovic
PDF
A Fine-Grained Artist Identification Method for Authentication and Attribution of Drawings Using Hatching Lines Shahrzad Ziaee, Ahmed Elgammal, Marian Mazzone
PDF
A Generative AI Game Jam Case Study from October 2024 Josef B. Spjut
PDF
A Large-Scale Analysis on Contextual Self-Supervised Video Representation Learning Akash Kumar, Ashlesha Kumar, Vibhav Vineet, Yogesh S. Rawat
PDF
A Lightweight Moment Retrieval System with Global Re-Ranking and Robust Adaptive Bidirectional Temporal Search Tinh-Anh Nguyen-Nhu, Huu-Loc Tran, Nguyen-Khang Le, Minh-Nhat Nguyen, Tien-Huy Nguyen, Hoang-Long Nguyen-Huu, Huu-Phong Phan-Nguyen, Huy-Thach Pham, Quan Nguyen, Hoang M. Le, Quang-Vinh Dinh
PDF
A Novel 3D Decoder with Weighted and Learnable Triple Attention for 3D Microscopy Image Segmentation Siyavash Shabani, Sahar A. Mohammed, Bahram Parvin
PDF
A Semi-Self-Supervised Approach for Dense-Pattern Video Object Segmentation Keyhan Najafian, Farhad Maleki, Lingling Jin, Ian Stavness
PDF
A Sensor Agnostic Domain Generalization Framework for Leveraging Geospatial Foundation Models: Enhancing Semantic Segmentation via Synergistic Pseudo-Labeling and Generative Learning Anan Yaghmour, Melba M. Crawford, Saurabh Prasad
PDF
A Simple Combination of Diffusion Models for Better Quality Trade-Offs in Image Denoising Jonas Dornbusch, Emanuel Pfarr, Florin-Alexandru Vasluianu, Frank Werner, Radu Timofte
PDF
A Simple Detector with Frame Dynamics Is a Strong Tracker Chenxu Peng, Chenxu Wang, Minrui Zou, Danyang Li, Zhengpeng Yang, Yimian Dai, Ming-Ming Cheng, Xiang Li
PDF
A Strong Baseline for Multi-Person Tracking in Thermal Infrared Imagery Daniel Stadler, Andreas Specker
PDF
A Survey of State of the Art Large Vision Language Models: Benchmark Evaluations and Challenges Zongxia Li, Xiyang Wu, Hongyang Du, Fuxiao Liu, Huy Nghiem, Guangyao Shi
PDF
A True Hyperspectral Image Super-Resolution Dataset Alexander Ulrichsen, Thomas De Kerf, David Dunphy, Paul Murray, Steve Vanlanduit, Stephen Marshall
PDF
A Visual RAG Pipeline for Few-Shot Fine-Grained Product Classification Bianca Lamm, Janis Keuper
PDF
Action Anticipation from SoccerNet Football Video Broadcasts Mohamad Dalal, Artur Xarles, Anthony Cioppa, Silvio Giancola, Marc Van Droogenbroeck, Bernard Ghanem, Albert Clapés, Sergio Escalera, Thomas B. Moeslund
PDF
Action Valuation in Sports: A Survey Artur Xarles, Sergio Escalera, Thomas B. Moeslund, Albert Clapés
PDF
ActNAS: Generating Efficient YOLO Models Using Activation NAS Sudhakar Sah, Ravish Kumar, Darshan C. Ganji, Ehsan Saboori
PDF
ADAPTOR: Adaptive Token Reduction for Video Diffusion Transformers Elia Peruzzo, Adil Karjauv, Nicu Sebe, Amir Ghodrati, AmirHossein Habibian
PDF
AdaVid: Adaptive Video-Language Pretraining Chaitanya Patel, Juan Carlos Niebles, Ehsan Adeli
PDF
Advancements in Affective and Behavior Analysis: The 8th ABAW Workshop and Competition Dimitrios Kollias, Panagiotis Tzirakis, Alan Cowen, Stefanos Zafeiriou, Irene Kotsia, Eric Granger, Marco Pedersoli, Simon Bacon, Alice Baird, Chris Gagne, Chunchang Shao, Guanyu Hu, Soufiane Belharbi, Muhammad Haseeb Aslam
PDF
Advancing Ambient Lighting Normalization via Diffusion Shadow Generation Xin Lu, Jiarong Yang, Yuanfei Bao, Zihao Fan, Anya Hu, Kunyu Wang, Jie Xiao, Xi Wang, Hongjian Liu, Xueyang Fu, Zheng-Jun Zha
PDF
Advancing Facial Age Progression for Occluded Faces Ankit Birla, Akshay Agarwal
PDF
Adversarially Domain-Adaptive Latent Diffusion for Unsupervised Semantic Segmentation Jongmin Yu, Zhongtian Sun, Chen Bene Chi, Jinhong Yang, Shan Luo
PDF
Aerial Infrared Health Monitoring of Solar Photovoltaic Farms at Scale Isaac Corley, Conor Wallace, Sourav Agrawal, Burton Putrah, Jonathan Lwowski
PDF
AerOSeg: Harnessing SAM for Open-Vocabulary Segmentation in Remote Sensing Images Saikat Dutta, Akhil Vasim, Siddhant Gole, Hamid Rezatofighi, Biplab Banerjee
PDF
AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification Earl Ranario, Lars Lundqvist, Heesup Yun, Brian N. Bailey, J. Mason Earles
PDF
Agri-FM+: A Self-Supervised Foundation Model for Agricultural Vision Md Jaber Al Nahian, Tapotosh Ghosh, Farnaz Sheikhi, Farhad Maleki
PDF
Agro-Net: A Convolution-Attention Fusion Based Hyperspectral Model for Agro-Food Quality Assessment Ocean Monjur, Md. Toukir Ahmed, Md Wadud Ahmed, Mohammed Kamruzzaman
PDF
AI Hiring with LLMs: A Context-Aware and Explainable Multi-Agent Framework for Resume Screening Frank P.-W. Lo, Jianing Qiu, Zeyu Wang, Haibao Yu, Yeming Chen, Gao Zhang, Benny Lo
PDF
AI-Based Video Content Understanding for Automatic and Interactive Multimedia Retrieval Klaus Schoeffmann, Mario Leopold
PDF
An Efficient and Scalable Framework for Lightweight Crop Disease Recognition in Low-Resource Settings Tushar Shinde
PDF
An Empirical Study for Efficient Video Quality Assessment Wei Sun, Kang Fu, Linhan Cao, Dandan Zhu, Kaiwei Zhang, Yucheng Zhu, Zicheng Zhang, Menghan Hu, Xiongkuo Min, Guangtao Zhai
PDF
An End-to-End Pipeline for Virtual Banner Replacement in Football Broadcasts Victor Gaspar, Anthony Cioppa, Jan Held, Silvio Giancola, Marc Braham, Adrien Deliège, Bernard Ghanem, Marc Van Droogenbroeck
PDF
An Interactive Agent Foundation Model Zane Durante, Ran Gong, Bidipta Sarkar, Naoki Wake, Rohan Taori, Paul Tang, Shrinidhi Kowshika Lakshmikanth, Kevin A. Schulman, Arnold Milstein, Hoi Vo, Ehsan Adeli, Demetri Terzopoulos, Li Fei-Fei, Jianfeng Gao
PDF
An LLM Framework for Long-Form Video Retrieval and Audio-Visual Question Answering Using Qwen2/2.5 Damianos Galanopoulos, Andreas Goulas, Antonios Leventakis, Ioannis Patras, Vasileios Mezaris
PDF
An LLM-Enabled Multi-Agent Autonomous Mechatronics Design Framework Zeyu Wang, Frank Po Wen Lo, Qian Chen, Yongqi Zhang, Chen Lin, Xu Chen, Zhenhua Yu, Alexander J. Thompson, Eric M. Yeatman, Benny P. L. Lo
PDF
Analyzing Hierarchical Structure in Vision Models with Sparse Autoencoders Matthew Lyle Olson, Musashi Hinck, Neale Ratzlaff, Changbai Li, Phillip Howard, Vasudev Lal, Shao-Yen Tseng
PDF
AnomalyHybrid: A Domain-Agnostic Generative Framework for General Anomaly Detection Ying Zhao
PDF
AppleGrowthVision: A Large-Scale Stereo Dataset for Phenological Analysis, Fruit Detection, and 3D Reconstruction in Apple Orchards Laura von Hirschhausen, Jannes S. Magnusson, Mykyta Kovalenko, Fredrik Boye, Tanay Rawat, Peter Eisert, Anna Hilsmann, Sebastian Pretzsch, Sebastian Bosse
PDF
ARC-NeRF: Area Ray Casting for Broader Unseen View Coverage in Few-Shot Object Rendering Seunghyeon Seo, Yeonjin Chang, Jayeon Yoo, Seungwoo Lee, Hojun Lee, Nojun Kwak
PDF
ARDGen: Augmentation Regularization for Domain-Generalized Medical Report Generation Syed Bilal Ahsan, Muhammad Ikhalas, Muhammad Muzamil Khan, Sana Ullah, Muhammad Zaigham Zaheer
PDF
Are Vision-Language Models Ready for Dietary Assessment? Exploring the Next Frontier in AI-Powered Food Image Recognition Sergio Romero-Tapiador, Ruben Tolosana, Blanca Lacruz-Pleguezuelos, Laura Judith Marcos-Zambrano, Guadalupe X. Bazán, Isabel Espinosa-Salinas, Julian Fierrez, Javier Ortega-Garcia, Enrique Carrillo de Santa Pau, Aythami Morales
PDF
AthletePose3D: A Benchmark Dataset for 3D Human Pose Estimation and Kinematic Validation in Athletic Movements Calvin Yeung, Tomohiro Suzuki, Ryota Tanaka, Zhuoer Yin, Keisuke Fujii
PDF
Attacking Attention of Foundation Models Disrupts Downstream Tasks Hondamunige Prasanna Silva, Federico Becattini, Lorenzo Seidenari
PDF
Attention-Aware Temporal Adversarial Shadows on Traffic Sign Sequences Pedram MohajerAnsari, Amir Salarpour, David Fernandez, Cigdem Kokenoz, Bing Li, Mert D. Pesé
PDF
Attention-Guided Hierarchical Defense for Multimodal Attacks in Vision-Language Models Long Chen, Yuling Chen, Yun Luo, Hui Dou, Xinyang Zhong
PDF
AttentiveGRU: Recurrent Spatio-Temporal Modeling for Advanced Radar-Based BEV Object Detection Loveneet Saini, Mirko Meuter, Hasan Tercan, Tobias Meisen
PDF
Augmented Reality Applications Using Active Markers with an Event Camera Shintaro Shiba, Quan Kong, Norimasa Kobori
PDF
Automated Essential Concept Discovery for Few-Shot Out-of-Distribution Detection Guangyao Chen, Kai A. Horstmann, Zhilong Wang, Fengqi You
PDF
Autonomous Multimodal Reasoning via Implicit Chain-of-Vision Yiqiao Huang, Qi He, Zhaorun Chen, Haopeng Zhang, Hanchao Yu, Zhuokai Zhao
PDF
Balancing Privacy and Action Performance: A Penalty-Driven Approach to Image Anonymization Nazia Aslam, Kamal Nasrollahi
PDF
Behind the Magic, MERLIM: Multi-Modal Evaluation Benchmark for Large Image-Language Models Andrés Villa, Juan León Alcázar, Alvaro Soto, Bernard Ghanem
PDF
Benchmarking Multi-Modal Semantic Segmentation Under Sensor Failures: Missing and Noisy Modality Robustness Chenfei Liao, Kaiyu Lei, Xu Zheng, Junha Moon, Zhixiong Wang, Yixuan Wang, Danda Pani Paudel, Luc Van Gool, Xuming Hu
PDF
Best Linear Unbiased Estimation for 2D and 3D Flow with Event-Based Cameras Juan Luis Valerdi, Xabier Iturbe
PDF
Better Coherence, Better Height: Fusing Physical Models and Deep Learning for Forest Height Estimation from Interferometric SAR Data Ragini Bal Mahesh, Ronny Hänsch
PDF
Beyond Academic Benchmarks: Critical Analysis and Best Practices for Visual Industrial Anomaly Detection Aimira Baitieva, Yacine Bouaouni, Alexandre Briot, Dick Ameln, Souhaiel Khalfaoui, Samet Akcay
PDF
Beyond Neurofibrillary Tangles: Explainable AI for Microscopic Tauopathy Classification in Immunofluorescence Imaging Jesus Dassaef López-Barrios, Miguel Angel Ontiveros-Torres, Jose Antonio Cantoral-Ceballos
PDF
Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model Lu Xu, Sijie Zhu, Chunyuan Li, Chia-Wen Kuo, Fan Chen, Xinyao Wang, Guang Chen, Dawei Du, Ye Yuan, Longyin Wen
PDF
BiasBench: A Reproducible Benchmark for Tuning the Biases of Event Cameras Andreas Ziegler, David Joseph, Thomas Gossard, Emil Moldovan, Andreas Zell
PDF
BIMA: Bijective Maximum Likelihood Learning Approach to Hallucination Prediction and Mitigation in Large Vision-Language Models Huu-Thien Tran, Thanh-Dat Truong, Khoa Luu
PDF
BRAT: Bidirectional Relative Positional Attention Transformer for Event-Based Eye Tracking Yuliang Wu, Han Han, Jinze Chen, Wei Zhai, Yang Cao, Zhengjun Zha
PDF
Bridging Classical and Modern Computer Vision: PerceptiveNet for Tree Crown Semantic Segmentation Georgios Voulgaris
PDF
Bridging Detection and Re-Identification: Evaluating Trustworthiness and Error Propagation in Face Recognition Pipelines Kuan Yew Leong, Jaeseung Han
Bridging Morphology and Molecular Signatures: Multi-Task Deep Learning for Multi-Omics Prediction from Histopathology Fatemeh Dashti Ahangar, Jiann-Shiun Yuan
PDF
Bridging Self-Supervision and Mechanism of Action Discovery in Morphological Profiling Syed Sameed Husain, Jan Bober, Amaia Irizar, Miroslaw Bober
PDF
Bridging the Modality Gap: Training-Free Adaptation of Vision-Language Models for Remote Sensing via Visual Prototypes Clément Barbier, Baptiste Abeloss, Stéphane Herbin
PDF
CACP: Context-Aware Copy-Paste to Enrich Image Content for Data Augmentation Qiushi Guo, Shaoxiang Wang, Chun-Peng Chang, Jason R. Rambach
PDF
CaddieSet: A Golf Swing Dataset with Human Joint Features and Ball Information Seunghyeon Jung, Seoyoung Hong, Jiwoo Jeong, Seungwon Jeong, Jaerim Choi, Hoki Kim, Woojin Lee
PDF
CadenceRAG: Context-Aware and Dependency-Enhanced Retrieval Augmented Generation for Holistic Video Understanding Heng Liu, Siru Jiang, Fangyun Duan, Yongzhe Lyu, Xiusong Wang, Hanlin Ge, Chao Liang
PDF
California Crop Yield Benchmark: Combining Satellite Image, Climate, Evapotranspiration, and Soil Data Layers for County-Level Yield Forecasting of over 70 Crops Hamid Kamangir, Mona Hajiesmaeeli, J. Mason Earles
PDF
Camera-Only 3D Panoptic Scene Completion for Autonomous Driving Through Differentiable Object Shapes Nicola Marinello, Simen Cassiman, Jonas Heylen, Marc Proesmans, Luc Van Gool
PDF
Can Geometry Save Central Views for Sports Field Registration? Floriane Magera, Thomas Hoyoux, Martin Castin, Olivier Barnich, Anthony Cioppa, Marc Van Droogenbroeck
PDF
Can Relevance Feedback, Conversational Search and Foundation Models Work Together for Interactive Video Search and Exploration? Ujjwal Sharma, Omar Shahbaz Khan, Stevan Rudinac, Björn Þór Jónsson
PDF
Can Vision-Language Models Understand and Interpret Dynamic Gestures from Pedestrians? Pilot Datasets and Exploration Towards Instructive Nonverbal Commands for Cooperative Autonomous Vehicles Tonko E. W. Bossen, Andreas Møgelmose, Ross Greer
PDF
CARN: Complexity-Aware Routing Network for Efficient and Adaptive Inference Rebati Raman Gaire, Arman Roohi
PDF
CDVS: Compressed Domain on Device Memory Efficient 8K Video SlowMo Jing Li, Chengyu Wang, Hamid R. Sheikh, Seok-Jun Lee
PDF
CE-NPBG: Connectivity Enhanced Neural Point-Based Graphics for Novel View Synthesis in Autonomous Driving Scenes Mohammad Altillawi, Fengyi Shen, Liudi Yang, Sai Manoj Prakhya, Ziyuan Liu
PDF
CellRep: A Multichannel Image Representation Learning Model Lawrence Phillips, Rory M. Donovan-Maiye
PDF
Choosing 'Right' from Wrong: A Closer Look at Selection Bias in Spatial Multiple-Choice Questions in Large Multimodal Models Giselle Zeno, Nour Jedidi, Steven Gomez
PDF
CityGen: Infinite and Controllable City Layout Generation Jie Deng, Wenhao Chai, Jianshu Guo, Qixuan Huang, Junsheng Huang, Wenhao Hu, Shengyu Hao, Jenq-Neng Hwang, Gaoang Wang
PDF
Classification Drives Geographic Bias in Street Scene Segmentation Rahul Nair, Bhanu Tokas, Gabriel Tseng, Esther Rolf, Hannah Kerner
CleanMAP: Distilling Multimodal LLMs for Confidence-Driven Crowdsourced HD Map Updates Ankit Kumar Shaw, Kun Jiang, Tuopu Wen, Chandan Kumar Sah, Yining Shi, Mengmeng Yang, Diange Yang, Xiaoli Lian
PDF
CLIP-SLA: Parameter-Efficient CLIP Adaptation for Continuous Sign Language Recognition Sarah N. Alyami, Hamzah Luqman
PDF
Clip4Retrofit: Enabling Real-Time Image Labeling on Edge Devices via Cross-Architecture CLIP Distillation Li Zhong, Ahmed Ghazal, Jun-Jun Wan, Frederik Zilly, Patrick Mackens, Joachim E. Vollrath, Bogdan Sorin Coseriu
PDF
CLIPDraw++: Text-to-Sketch Synthesis with Simple Primitives Nityanand Mathur, Shyam Marjit, Abhra Chaudhuri, Anjan Dutta
PDF
CoDEx: Combining Domain Expertise for Spatial Generalization in Satellite Image Analysis Abhishek Kuriyal, Elliot Vincent, Mathieu Aubry, Loïc Landrieu
PDF
Combining Vision-Language Models and Weak Supervision for Nuanced Vision Classification Tasks Seyed Mohamad Ali Tousi, Jacket Demby's, Ramy Farag, Gbenga Omotara, Guilherme N. DeSouza
PDF
Comparison Visual Instruction Tuning Wei Lin, Muhammad Jehanzeb Mirza, Sivan Doveh, Rogério Feris, Raja Giryes, Sepp Hochreiter, Leonid Karlinsky
PDF
Compositional Image-Text Matching and Retrieval by Grounding Entities Madhukar Reddy Vongala, Saurabh Srivastava, Jana Kosecka
PDF
Compressed Domain Multiframe Processing Chengyu Wang, Jing Li, Saurabh Kumar, Seok-Jun Lee, Hamid R. Sheikh
PDF
CondiMen: Conditional Multi-Person Mesh Recovery Romain Brégier, Fabien Baradel, Thomas Lucas, Salma Galaaoui, Matthieu Armando, Philippe Weinzaepfel, Grégory Rogez
PDF
Confidence-Calibrated Covariate Shift Correction for Few-Shot Classification in Vision-Language Models Behraj Khan, Rizwan Qureshi, Nouman M. Durrani, Tahir Qasim Syed
PDF
conSAMme: Achieving Consistent Segmentations with SAM Josh Myers-Dean, Kangning Liu, Brian L. Price, Yifei Fan, Jason Kuen, Danna Gurari
PDF
Coordinated Robustness Evaluation Framework for Vision-Language Models Ashwin Ramesh Babu, Sajad Mousavi, Vineet Gundecha, Sahand Ghorbanpour, Avisek Naug, Antonio Guillen, Ricardo Luna, Soumyendu Sarkar
PDF
COP-GEN-Beta: Unified Generative Modelling of COPernicus Imagery Thumbnails Miguel Espinosa, Valerio Marsocci, Yuru Jia, Elliot Crowley, Mikolaj Czerkawski
PDF
Cross-Modal Consistency Learning for Sign Language Recognition Kepeng Wu, Zecheng Li, Weichao Zhao, Hezhen Hu, Wengang Zhou, Houqiang Li
PDF
Cross-Modal Facial Expression Recognition with Global Channel-Spatial Attention: Modal Enhancement and Proportional Criterion Fusion Jun Yu, Yang Zheng, Lei Wang, Yongqi Wang, Shengfan Xu
PDF
Cross-Spectral Body Recognition with Side Information Embedding: Benchmarks on LLCM and Analyzing Range-Induced Occlusions on IJB-MDF Anirudh Nanduri, Siyuan Huang, Rama Chellappa
PDF
CSRN: Cross-Sensor Robust Recognition Network for Multi-Modal Aerial View Object Classification Hongli Liu, Wang Yu, Shengjie Zhao
PDF
CTC: Contribution to Classification of Complex Features Sophia Kalanovska, Michael Luck, Christopher Hampson
PDF
Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection Huu-Phong Phan-Nguyen, Anh Dao, Tien-Huy Nguyen, Tuan Quang, Huu-Loc Tran, Tinh-Anh Nguyen-Nhu, Huy-Thach Pham, Quan Nguyen, Hoang M. Le, Quang-Vinh Dinh
PDF
CYFLOD: Cyclic Filtering and Loss Damping for Alleviating Noisy Labels in Fine-Grained Visual Classification Nauman Ullah Gilal, Khaled A. Al-Thelaya, Fahad Majeed, Zhihe Lu, Sabri Boughorbel, Jens Schneider, Marco Agus
PDF
CytoFM: The First Cytology Foundation Model Vedrana Ivezic, Ashwath Radhachandran, Ekaterina Redekop, Shreeram Athreya, Dongwoo Lee, Vivek Sant, Corey W. Arnold, William Speier
PDF
D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition Rupayan Mallick, Sibo Dong, Nataniel Ruiz, Sarah Adel Bargal
PDF
DAF: Distillation, Augmentation and Filtering Based Framework for Efficient Smartphone Human Activity Recognition Ujjal Kr Dutta, Guan-Ming Su
PDF
Data Scaling Laws for End-to-End Autonomous Driving Alexander Naumann, Xunjiang Gu, Tolga Dimlioglu, Mariusz Bojarski, Alperen Degirmenci, Alexander Popov, Devansh Bisla, Marco Pavone, Urs Muller, Boris Ivanovic
PDF
DataFormer: Differential Additive Transformer for Lightweight Semantic Segmentation Mian Muhammad Naeem Abid, Nancy Mehta, Zongwei Wu, Radu Timofte
PDF
Datasets for Valence and Arousal Inference: A Survey Helen Schneider, Svetlana Pavlitska, Helen Gremmelmaier, Marius Zöllner
PDF
DCSEG: Decoupled 3D Open-Set Segmentation Using Gaussian Splatting Luis Wiedmann, Luca Wiehe, Dávid Rozenberszki
PDF
Deciding the Path: Leveraging Multi-Agent Systems for Solving Complex Tasks Iman Abbasnejad, Xuefeng Liu, Atanu Roy
PDF
DeclutterNeRF: Generative-Free 3D Scene Recovery for Occlusion Removal Wanzhou Liu, Zhexiao Xiong, Xinyu Li, Nathan Jacobs
PDF
Decoding Vision Transformers: The Diffusion Steering Lens Ryota Takatsuki, Sonia Joseph, Ippei Fujisawa, Ryota Kanai
PDF
Decomposing Food Images for Better Nutrition Analysis: A Nutritionist-Inspired Two-Step Multimodal LLM Approach Pitikorn Khlaisamniang, Kun Kerdthaisong, Supasate Vorathammathorn, Nutchanon Yongsatianchot, Hirunkul Phimsiri, Amrest Chinkamol, Teermade Thitseesaeng, Kanyakorn Veerakanjana, Kaisorn Kachai, Piyalitt Ittichaiwong, Tossaporn Saengja
PDF
Decoupling Identity Confounders for Enhanced Facial Expression Recognition: An Information-Theoretic Approach Mohd Aquib, Nishchal K. Verma, M. Jaleel Akhtar
PDF
Deep Diffusion Models and Unsupervised Hyperspectral Unmixing for Realistic Abundance Map Synthesis Martina Pastorino, Michael Alibani, Nicola Acito, Gabriele Moser
PDF
Defending Against Frequency-Based Attacks with Diffusion Models Fatemeh Amerehi, Patrick Healy
PDF
Defending Against Transfer-Based Adversarial Attacks Using SVD-Driven Feature Evolution Xinlei Liu, Tao Hu, Peng Yi, Qingtao Pan, Hailong Ma, Yiming Jiang, Baolin Li
PDF
Define, Refine, Align: Correspondence-Free 3D Line Alignment with Attentional, Equivariant and Rotational Layers Alberto Pepe, Yuxin Yao, Joan Lasenby
PDF
DEFT-VTON: Efficient Virtual Try-on with Consistent Generalised H-Transform Xingzi Xu, Qi Li, Shuwen Qiu, Julien Han, Karim Bouyarmane
PDF
Defurnishing with X-Ray Vision: Joint Removal of Furniture from Panoramas and Mesh Alan Dolhasz, Chen Ma, Dave Gausebeck, Kevin Chen, Gregor Miller, Lucas Hayne, Gunnar Hovden, Azwad Sabik, Olaf Brandt, Mira Slavcheva
PDF
DELTA: Dense Depth from Events and LiDAR Using Transformer's Attention Vincent Brebion, Julien Moreau, Franck Davoine
PDF
Demo: Point-Feature Tracking for Pixel Processor Arrays Laurie Bose, Piotr Dudek
PDF
Detect, Classify, Act: Categorizing Industrial Anomalies with Multi-Modal Large Language Models Sassan Mokhtar, Arian Mousakhan, Silvio Galesso, Jawad Tayyub, Thomas Brox
PDF
Detecting Localized Deepfake Manipulations Using Action Unit-Guided Video Representations Tharun Anand, Siva Sankar, Pravin Nair
PDF
Detecting Looted Archaeological Sites from Satellite Image Time Series Elliot Vincent, Mehraïl Saroufim, Jonathan Chemla, Yves Ubelmann, Philippe Marquis, Jean Ponce, Mathieu Aubry
PDF
Detection and Localization of Drones and UAVs Using Sound and Vision Erik Tegler, Max Modig, Per Skarin, Kalle Åström, Magnus Oskarsson, Gabrielle Flood
PDF
Detector-Free Image Matching with Lightweight Backbone and Feature Filtering Xiaolong Guo, Min Wang, Hui Wu, Wengang Zhou, Houqiang Li
PDF
Diffusion-Based Continuous Sign Language Generation with Cluster-Specific Fine-Tuning and Motion-Adapted Transformer Razieh Rastgoo, Kourosh Kiani, Sergio Escalera
PDF
Direction-Aware Hybrid Representation Learning for 3D Hand Pose and Shape Estimation Shiyong Liu, Zhihao Li, Xiao Tang, Jianzhuang Liu
PDF
Disentangling Polysemantic Channels in Convolutional Neural Networks Robin Hesse, Jonas Fischer, Simone Schaub-Meyer, Stefan Roth
PDF
Disentangling Visual Transformers: Patch-Level Interpretability for Image Classification Guillaume Jeanneret, Loïc Simon, Frédéric Jurie
PDF
Dist-Tracker: A Small Object-Aware Detector and Tracker for UAV Tracking Wenzhen Wang, Jing Fu, Jiayi Song, Kaiyu Li, Hui Qiao, Jiang Liu, Hao Sun, Xiangyong Cao
PDF
Distillation-Supervised Convolutional Low-Rank Adaptation for Efficient Image Super-Resolution Xinning Chai, Yao Zhang, Yuxuan Zhang, Zhengxue Cheng, Yingsheng Qin, Yucai Yang, Li Song
PDF
Distilling Normalizing Flows Steven Walton, Valeriy Klyukin, Maksim Artemev, Denis Derkach, Nikita Orlov, Humphrey Shi
PDF
Distribution Shifts at Scale: Out-of-Distribution Detection in Earth Observation Burak Ekim, Girmaw Abebe Tadesse, Caleb Robinson, Gilles Quentin Hacheme, Michael Schmitt, Rahul Dodhia, Juan M. Lavista Ferres
PDF
DLST: Dual-Template Co-Evolution Learning for Robust Long-Term Drone Tracking in Dynamic Environments Jiahao Zhang, Yixin Wei, Jinli Zhang, Zongli Jiang, Peiwen Yu, Yufei Ma, Runan Jin
PDF
Document Image Rectification Using Stable Diffusion Transformer Pooja Kumari, Sukhendu Das
PDF
Domain Adaptation for Skin Lesion: Evaluating Real-World Generalisation Nurjahan Sultana, Wenqi Lu, Xinqi Fan, Moi Hoon Yap
PDF
Domain Adaptation of VLM for Soccer Video Understanding Tiancheng Jiang, Henry Wang, Md Sirajus Salekin, Parmida Atighehchian, Shinan Zhang
PDF
Domain Generalization Through Attenuation of Domain-Specific Information Reiji Saito, Kazuhiro Hotta
PDF
Drive4C: A Closed-Loop Benchmark on What Foundation Models Really Need to Be Capable of for Language-Guided Autonomous Driving Tin Stribor Sohn, Maximilian Dillitzer, Johannes Bach, Jason J. Corso, Tim Brühl, Robin Schwager, Tim Dieter Eberhardt, Eric Sax
PDF
Drug Discovery Agent: An Automated Vision Detection System for Drug-Cell Interactions Adib Bazgir, Yuwen Zhang
PDF
Dual Precision Quantization for Efficient and Accurate Deep Neural Networks Inference Tomer Gafni, Asaf Karnieli, Yair Hanani
PDF
Dual-Input Frequency-Aware Network for High-Quality Thermal Image Super-Resolution Priya Kansal, Sabari Nathan
PDF
Dual-Path Enhancements in Event-Based Eye Tracking: Augmented Robustness and Adaptive Temporal Modeling Hoang M. Truong, Vinh-Thuan Ly, Huy G. Tran, Thuan-Phat Nguyen, Tram T. Doan
PDF
Dual-Stage Cross-Modal Network with Dynamic Feature Fusion for Emotional Mimicry Intensity Estimation Jun Yu, Lingsi Zhu, Yanjun Chi, Yunxiang Zhang, Yang Zhen, Yongqi Wang, Xilong Lu
PDF
DuoSpaceNet: Leveraging Both Bird's-Eye-View and Perspective View Representations for 3D Object Detection Zhe Huang, Yizhe Zhao, Hao Xiao, Chenyan Wu, Lingting Ge
PDF
Dust to Detail: Restoring Sand-Dust Images with Frequency-Guided Attention and Multi-Scale Features Romala Mishra, Sobhan Kanti Dhara
PDF
Dyadic Mamba: Long-Term Dyadic Human Motion Synthesis Julian Tanke, Takashi Shibuya, Kengo Uchida, Koichi Saito, Yuki Mitsufuji
PDF
Dynamic EventNeRF: Reconstructing General Dynamic Scenes from Multi-View RGB and Event Streams Viktor Rudnev, Gereon Fox, Mohamed Elgharib, Christian Theobalt, Vladislav Golyanik
PDF
Dynamic State-Control Modeling for Generalized Remote Sensing Image Super-Resolution Chenyu Li, Zhaojie Pan, Danfeng Hong
PDF
Dynamic Watermarks in Images Generated by Diffusion Models Yunzhuo Chen, Jordan Vice, Naveed Akhtar, Nur Al Hasan Haldar, Ajmal Mian
PDF
DySS: Dynamic Queries and State-Space Learning for Efficient 3D Object Detection from Multi-Camera Videos Rajeev Yasarla, Shizhong Han, Hong Cai, Fatih Porikli
PDF
E-BARF: Bundle Adjusting Neural Radiance Fields from a Moving Event Camera Zhipeng Tang, Shifan Zhu, Zezhou Cheng, Donghyun Kim, Erik G. Learned-Miller
PDF
E-VLC: A Real-World Dataset for Event-Based Visible Light Communication and Localization Shintaro Shiba, Quan Kong, Norimasa Kobori
PDF
ECO-AI - Energy-Conscious Optimization for AI Training János Horváth
PDF
EcoWikiRS: Learning Ecological Representation of Satellite Images from Weak Supervision with Species Observations and Wikipedia Valérie Zermatten, Javiera Castillo-Navarro, Pallavi Jain, Devis Tuia, Diego Marcos
PDF
Effectiveness of Max-Pooling for Fine-Tuning CLIP on Videos Fatimah Zohra, Chen Zhao, Shuming Liu, Bernard Ghanem
PDF
Effectiveness of Training with Procedurally Generated Synthetic Images of Crop Plants Nazifa Azam Khan, Mikolaj Cieslak, Mark G. Eramian, Ian McQuillan
PDF
Efficient 2D to Full 3D Human Pose Uplifting Including Joint Rotations Katja Ludwig, Yuliia Oksymets, Robin Schön, Daniel Kienzle, Rainer Lienhart
PDF
Efficient Burst Super-Resolution with One-Step Diffusion Kento Kawai, Takeru Oba, Kyotaro Tokoro, Kazutoshi Akita, Norimichi Ukita
PDF
Efficient Image Generation with Variadic Attention Heads Steven Walton, Ali Hassani, Xingqian Xu, Zhangyang Wang, Humphrey Shi
PDF
Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation Thomas Kerdreux, Alexandre Tuel, Quentin Febvre, Alexis Mouche, Bertrand Chapron
PDF
Efficient Task-Specific Conditional Diffusion Policies: Shortcut Model Acceleration and SO(3) Optimization Haiyong Yu, Yanqiong Jin, Yonghao He, Wei Sui
PDF
Efficient VideoMAE via Temporal Progressive Training Xianhang Li, Peng Wang, Xinyu Li, Heng Wang, Hongru Zhu, Cihang Xie
PDF
Efficiently Mitigating Video Content Misalignment on Large Vision Model with Time-Series Data Alignment Hanchen Xie, Rose Ma, Jiageng Zhu, Zheda Mai, Wael Abd-Almageed, Zubin Abraham
PDF
Egocentric Event-Based Vision for Ping Pong Ball Trajectory Prediction Ivan Alberico, Marco Cannici, Giovanni Cioffi, Davide Scaramuzza
PDF
EigenLoRAx: Recycling Adapters to Find Principal Subspaces for Resource-Efficient Adaptation and Inference Prakhar Kaushik, Ankit Vaidya, Shravan Chaudhari, Alan L. Yuille
EL-Attack: Explicit and Latent Space Hybrid Optimization Based General and Effective Attack for Autonomous Driving Trajectory Prediction Xuesong Bai, Changhang Tian, Wei Xia, Zhenshu Ma, Haiyang Yu, Yilong Ren
PDF
Embedding Shift Dissection on CLIP: Effects of Augmentations on VLM's Representation Learning Ashim Dahal, Saydul Akbar Murad, Nick Rahimi
PDF
Emotions in LatAm: A New Dataset and Benchmark for Emotion Recognition in Latin America Pooja Kishore Kumar, Willams de Lima Costa, Renato Nogueira Ferraz e Oliveira, Veronica Teichrieb, Estefania Talavera Martínez
PDF
EmoVLM-KD: Fusing Distilled Expertise with Vision-Language Models for Visual Emotion Analysis SangEun Lee, Yubeen Lee, Eunil Park
PDF
Enforcing View-Consistency in Class-Agnostic 3D Segmentation Fields Corentin Dumery, Aoxiang Fan, Ren Li, Nicolas Talabot, Pascal Fua
PDF
Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection Jiancheng Pan, Yanxing Liu, Xiao He, Long Peng, Jiahao Li, Yuze Sun, Xiaomeng Huang
PDF
Enhanced Multi-View Pedestrian Detection Using Probabilistic Occupancy Volume Reef Alturki, Adrian Hilton, Jean-Yves Guillemaut
PDF
Enhanced Semantic Extraction and Guidance for UGC Image Super Resolution Yiwen Wang, Ying Liang, Yuxuan Zhang, Xinning Chai, Zhengxue Cheng, Yingsheng Qin, Yucai Yang, Rong Xie, Li Song
PDF
Enhancing Facial Expression Recognition with LSTM Through Dual-Direction Attention Mixed Feature Networks and CLIP Josep Cabacas-Maso, Elena Ortega-Beltrán, Ismael Benito-Altamirano, Carles Ventura
PDF
Enhancing Few-Shot Class-Incremental Learning via Frozen Feature Augmentation Shimou Ling, Shengkai Gan, Caoxin Wang, Lili Pan, Hongliang Li
PDF
Enhancing Multi-Modal Automatic Target Recognition Using Out-of-Distribution Exploitation (MATRODE) Hongzhi Guo, Paul T. Schrader, Erik Blasch
PDF
Enhancing Vision Transformer Explainability Using Artificial Astrocytes Nicolas Echevarrieta-Catalan, Ana Ribas-Rodriguez, Francisco Cedron, Odelia Schwartz, Vanessa Aguiar-Pulido
PDF
ePBR: Extended PBR Materials in Image Synthesis Yu Guo, Zhiqiang Lao, Xiyun Song, Yubin Zhou, Zongfang Lin, Heather Yu
PDF
EV-Flying: An Event-Based Dataset for In-the-Wild Recognition of Flying Objects Gabriele Magrini, Federico Becattini, Giovanni Colombo, Pietro Pala
PDF
EV-LayerSegNet: Self-Supervised Motion Segmentation Using Event Cameras Youssef Farah, Federico Paredes-Vallés, Guido de Croon, Muhammad Ahmed Humais, Hussain M. Sajwani, Yahya H. Zweiri
PDF
EvenFormer: Dynamic Even Transformer for Real-World Image Restoration Xin Lu, Yuanfei Bao, Jiarong Yang, Anya Hu, Jie Xiao, Kunyu Wang, Dong Li, Senyan Xu, Kean Liu, Xueyang Fu, Zheng-Jun Zha
PDF
Event Quality Score (EQS): Assessing the Realism of Simulated Event Camera Streams via Distance in Latent Space Kaustav Chanda, Aayush Atul Verma, Arpitsinh Vaghela, Yezhou Yang, Bharatesh Chakravarthi
PDF
Event-Based Continuous Color Video Decompression from Single Frames Ziyun Wang, Friedhelm Hamann, Kenneth Chaney, Wen Jiang, Guillermo Gallego, Kostas Daniilidis
PDF
Event-Based Eye Tracking. Event-Based Vision Workshop 2025 Qinyu Chen, Chang Gao, Min Liu, Daniele Perrone, Yan Ru Pei, Zuowen Wang, Zhuo Zou, Shihang Tan, Tao Han, Guorui Lu, Zhen Xu, Junyuan Ding, Ziteng Wang, Zongwei Wu, Han Han, Yuliang Wu, Jinze Chen, Wei Zhai, Yang Cao, Zhengjun Zha, Nuwan Bandara, Thivya Kandappu, Archan Misra, Xiaopeng Lin, Hongxiang Huang, Hongwei Ren, Bojun Cheng, Hoang M. Truong, Vinh-Thuan Ly, Huy G. Tran, Thuan-Phat Nguyen, Tram T. Doan
PDF
Event-Based Tracking and Imaging of Randomly Moving Objects in Dense Dynamical Scattering Media Ning Zhang, Timothy Shea, Arto V. Nurmikko
PDF
Event-Conditioned Dual-Modal Fusion for Motion Deblurring Kean Liu, Mingchen Zhong, Senyan Xu, Zhijing Sun, Jiaying Zhu, Chengjie Ge, Xingbo Wang, Xin Lu, Xueyang Fu, Zheng-Jun Zha
PDF
Event-Driven Dynamic Attention for Multi-Object Tracking on Neuromorphic Hardware Muhammad Aitsam, Sergio Davies, Alessandro G. Di Nuovo
PDF
ExaM: Unsupervised Concept-Based Representation Learning to Better Explain Models in Vision Tasks Maguelonne Heritier, Djebril Mekhazni, Cédric Leblond-Ménard, Benoit Godbout, Nathan Guilbaud, Mahdi Alehdaghi, Eric Granger
PDF
Exemplar Masking for Multimodal Incremental Learning Yi-Lun Lee, Chen-Yu Lee, Wei-Chen Chiu, Yi-Hsuan Tsai
PDF
Expanded SPAN for Efficient Super-Resolution Qing Wang, Yang Wang, Hongyu An, Yi Liu, Liou Zhang, Shijie Zhao
PDF
Explainable Physical PolSAR Autoencoders for Soil Moisture Estimation Nikita Basargin, Alberto Alonso-González, Irena Hajnsek
PDF
Explaining 3D Point Cloud Semantic Segmentation Models Through Adversarial Attacks Jorge Francisco Ciprián-Sánchez, Josafat-Mattias Burmeister, Rico Richter, Jürgen Döllner
PDF
Exploiting Adversarial Learning and Topology Augmentation for Open-Set Visual Recognition Rosa Zuccarà, Georgia Fargetta, Alessandro Ortis, Sebastiano Battiato
PDF
Exploring Missing Modality in Multimodal Egocentric Datasets Merey Ramazanova, Alejandro Pardo, Humam Alwassel, Bernard Ghanem
PDF
Exploring Modality Guidance to Enhance VFM-Based Feature Fusion for UDA in 3D Semantic Segmentation Johannes Spöcklberger, Wei Lin, Pedro Hermosilla, Sivan Doveh, Horst Possegger, Muhammad Jehanzeb Mirza
PDF
Exploring Semi-Supervised Learning for Online Mapping Adam Lilja, Erik Wallin, Junsheng Fu, Lars Hammarstrand
PDF
Exploring Temporal Dynamics in Event-Based Eye Tracker Hongxiang Huang, Xiaopeng Lin, Hongwei Ren, Yue Zhou, Bojun Cheng
PDF
Extra-Lightweight AI-Based Privacy Preserving Framework for Egocentric Wearable Cameras Long Li, Fengqing Zhu, Heather A. Eicher-Miller, J. Graham Thomas, Yuning Huang, Edward Sazonov
PDF
Eyes Tell the Truth: GazeVal Highlights Shortcomings of Generative AI in Medical Imaging David C. Wong, Bin Wang, Gorkem Durak, Marouane Tliba, Akshay Chaudhari, Aladine Chetouani, Ahmet Enis Çetin, Cagdas Topel, Nicolo Gennaro, Camila Lopes Vendrami, Tugce Agirlar Trabzonlu, Amir Ali Rahsepar, Laetitia Perronne, Matthew Antalek, Onural Ozturk, Gokcan Okur, Andrew C. Gordon, Ayis Pyrros, Frank H. Miller, Amir Borhani, Hatice Savas, Eric M. Hart, Drew A. Torigian, Jayaram K. Udupa, Elizabeth A. Krupinski, Ulas Bagci
PDF
Face Reconstruction from Face Embeddings Using Adapter to a Face Foundation Model Hatef Otroshi-Shahreza, Anjith George, Sébastien Marcel
PDF
FaceGest: A Comprehensive Facial Gesture Dataset for Human-Computer Interaction Yaseen, Sonain Jamil
PDF
Fairness-Aware Boosting Model for Imbalanced 3D Point Cloud Segmentation in Autonomous Driving Elahe Yahyapour, Chengbo Ai
PDF
FALCON: Fast Image Haze Removal Leveraging Continuous Density Mask Donghyun Kim, Seil Kang, Seong Jae Hwang
PDF
Fast Sphericity and Roundness Approximation in 2D and 3D Using Local Thickness Pawel Tomasz Pieta, Peter Winkel Rasmussen, Anders Bjorholm Dahl, Anders Nymark Christensen
PDF
FCTFANet: A Fused CNN-Transformer Feature Aggregator Network for Image Restoration Amit Monga, Hemkant Nehete, Partha Kaushik, Tharun Kumar Reddy Bollu, Balasubramanian Raman, Gaurav Sharma
PDF
Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection YeongHyeon Park, Sungho Kang, Myung Jin Kim, Hyeong Seok Kim, Juneho Yi
PDF
Feature Matching in the Dark: Homography-Based RGB-IR Feature Transformation for Low-Light Vision Kyle O'Donnell, Chandra Kambhamettu
PDF
FedAlign: Federated Domain Generalization with Cross-Client Feature Alignment Sunny Gupta, Vinay Sutar, Varunav Singh, Amit Sethi
PDF
FedCAPR: Federated Camera-Aware Unsupervised Person Re-Identification with Identity-Distributed Equalization for Decentralized Data Clustering Yu-Syuan Tseng, Tzu-Chin Hsu, Chih-Ting Liu, Shao-Yi Chien
PDF
FedCIAL: Federated Color-Invariant Adversarial Learning for Enhancing Fairness and Performance in Skin Lesion Classification Rahmat Izwan Heroza, John Q. Gan, Haider Raza
PDF
FedDG-MoE: Test-Time Mixture-of-Experts Fusion for Federated Domain Generalization Ahmed Radwan, Mahmoud Soliman, Omar Abdelaziz, Mohamed Shehata
PDF
FedSECA: Sign Election and Coordinate-Wise Aggregation of Gradients for Byzantine Tolerant Federated Learning Joseph Geo Benjamin, Mothilal Asokan, Mohammad Yaqub, Karthik Nandakumar
PDF
Few-Shot Adaptation of Grounding DINO for Agricultural Domain Rajhans Singh, Rafael Bidese-Puhl, Kshitiz Dhakal, Sudhir Sornapudi
PDF
FieldMOT: A Field-Registered Multi-Object Tracking for Sports Videos Hong-Qi Chen, Chao-Chi Liao, Yuan-Heng Sun, Cheng-Kuan Lin, Yu-Chee Tseng
PDF
Fine-Grained Few-Shot Classification with Part Matching Samuel Black, Richard Souvenir
PDF
FineCausal: A Causal-Based Framework for Interpretable Fine-Grained Action Quality Assessment Ruisheng Han, Kanglei Zhou, Amir Atapour-Abarghouei, Xiaohui Liang, Hubert P. H. Shum
PDF
FLAR-SVD: Fast and Latency-Aware Singular Value Decomposition for Model Compression Moritz Thoma, Jorge Villasante, Emad Aghajanzadeh, Shambhavi Balamuthu Sampath, Pierpaolo Morì, Maximilian Groetzinger, Daniil Dylkin, Manoj Rohit Vemparala, Nael Fasfous, Alexander Frickenstein, Daniel Mueller-Gritschneder, Ulf Schlichtmann
PDF
Flow-Guided Deformable Alignment with Channel-Wise Self-Attention Reconstruct for Efficient Burst HDR Restoration Weiyu Zhou, Tao Hu, Yixu Feng, Duwei Dai, Yu Cao, Peng Wu, Wei Dong, Yanning Zhang, Qingsen Yan
PDF
FM-LoRA: Factorized Low-Rank Meta-Prompting for Continual Learning Xiaobing Yu, Jin Yang, Xiao Wu, Peijie Qiu, Xiaofeng Liu
PDF
Food Degradation Analysis Using Multimodal Fuzzy Clustering Julio J. Valdés, Stephie Liu, Shawn Yang, Yuhao Chen, Alexander Wong, Pengcheng Xi
PDF
FoodVideoQA: A Novel Baseline Framework for Dietary Monitoring Krish Shah, Siddharth Viswanath, Pengcheng Xi, Alexander Wong, Yuhao Chen
PDF
ForesightNav: Learning Scene Imagination for Efficient Exploration Hardik Shah, Jiaxu Xing, Nico Messikommer, Boyang Sun, Marc Pollefeys, Davide Scaramuzza
PDF
Forget Less, Learn More: Contrastive-Based Federated Class Incremental Learning with a Low-Dimensional Projection Layer Ensieh Khazaei, Dimitrios Hatzinakos
PDF
Foundation Models for Remote Sensing: An Analysis of MLLMs for Object Localization Darryl Hannan, John Cooper, Dylan White, Timothy Doster, Henry Kvinge, Yijing Watkins
PDF
FreBIS: Frequency-Based Stratification for Neural Implicit Surface Representations Naoko Sawada, Pedro Miraldo, Suhas Lohit, Tim K. Marks, Moitreya Chatterjee
PDF
Frequency-Prior Enhanced Ambient Lighting Normalization via Visual Perceptual Refinement Yuanfei Bao, Xin Lu, Xingbo Wang, Jiarong Yang, Anya Hu, Kunyu Wang, Jie Xiao, Dong Li, Xueyang Fu, Zheng-Jun Zha
PDF
FrogDogNet: Fourier Frequency Retained Visual Prompt Output Guidance for Domain Generalization of CLIP in Remote Sensing Hariseetharam Gunduboina, Muhammad Haris Khan, Biplab Banerjee
PDF
From Beats to Scores: A Multi-Modal Framework for Comprehensive Figure Skating Assessment Fengshun Wang, Qiurui Wang, Dan Chen
PDF
From Broadcast to Minimap: Achieving State-of-the-Art SoccerNet Game State Reconstruction Vladimir Golovkin, Nikolay Nemtsev, Vasyl Shandyba, Oleg Udin, Nikita Kasatkin, Pavel Kononov, Anton Afanasiev, Sergey Ulasen, Andrei Boiarov
PDF
From Data to Design: Leveraging Frequency Statistics for Efficient Neural Network Architectures Mustafa Munir, Guihong Li, Md Mostafijur Rahman, Alex Zhang, Radu Marculescu
PDF
FullCycle: Full Stage Adversarial Attack for Reinforcement Learning Robustness Evaluation Zhenshu Ma, Xuan Cai, Changhang Tian, Yuqi Fan, Kemou Jiang, Gangfu Liu, Xuesong Bai, Aoyong Li, Yilong Ren, Haiyang Yu
PDF
FungiTastic: A Multi-Modal Dataset and Benchmark for Image Categorization Lukás Picek, Klára Janousková, Vojtech Cermák, Jiri Matas
PDF
FusedVision: A Knowledge-Infusing Approach for Practical Anomaly Detection in Real-World Surveillance Videos Khaled Waleed Dawoud, Zaigham Zaheer, Mustaqeem Khan, Karthik Nandakumar, Abdulmotaleb Elsaddik, Muhammad Haris Khan
PDF
Fusion or Confusion? A Look at Dataset Pooling for Infrared Object Detection Stefan Becker, Ann-Kristin Grosselfinger, Jens Bayer, David Münch, Wolfgang Hübner, Michael Arens
PDF
FUSION: Frequency-Guided Underwater Spatial Image recOnstructioN Jaskaran Singh Walia, Shravan Venkatraman, Pavithra L. K.
PDF
FusionNet: Multi-Model Linear Fusion Framework for Low-Light Image Enhancement Kangbiao Shi, Yixu Feng, Tao Hu, Yu Cao, Peng Wu, Yijin Liang, Yanning Zhang, Qingsen Yan
PDF
G-Buffer Supported Neural Screen-Space Refraction Baking for Real-Time Global Illumination Ziyang Zhang, Edgar Simo-Serra
PDF
GaussianVideo: Efficient Video Representation and Compression by Gaussian Splatting Inseo Lee, Youngyoon Choi, Joonseok Lee
PDF
Generalizable Unsupervised Microscopy Video Denoising via Weighted SpatioTemporal Sampling Mary Damilola Aiyetigbo, Wanqi Yuan, Feng Luo, Xin Li, Tong Ye, Nianyi Li
PDF
Generative AI for Film Creation: A Survey of Recent Advances Ruihan Zhang, Borou Yu, Jiajian Min, Yetong Xin, Zheng Wei, Juncheng Nemo Shi, Mingzhen Huang, Xianghao Kong, Nix Liu Xin, Shanshan Jiang, Praagya Bahuguna, Mark Chan, Khushi Hora, Lijian Yang, Yongqi Liang, Runhe Bian, Yunlei Liu, Isabela Campillo Valencia, Patricia Morales Tredinick, Ilia Kozlov, Sijia Jiang, Peiwen Huang, Na Chen, Xuanxuan Liu, Anyi Rao
PDF
Geometric Consistency Refinement for Single Image Novel View Synthesis via Test-Time Adaptation of Diffusion Models Josef Bengtson, David Nilsson, Fredrik Kahl
PDF
Geometry-Aware Texture Generation for 3D Head Modeling with Artist-Driven Control Amin Fadaeinejad, Abdallah Dib, Luiz Gustavo Hafemann, Emeline Got, Trevor Anderson, Amaury Depierre, Nikolaus F. Troje, Marcus A. Brubaker, Marc-André Carbonneau
PDF
Get a GRIP on Test Time Adaptation! - Group Robust Inference-Time Policy Optimization for Vision Models Prabhav Sanga, Jaskaran Singh, Tapabrata Chakraborti
PDF
gMINT: Gradiant-Based Membership Inference Test Applied to Image Models Daniel DeAlcala, Aythami Morales, Julian Fierrez, Gonzalo Mancera, Ruben Tolosana
PDF
Goal-Driven Human Motion Synthesis in Diverse Task Inwoo Hwang, Jinseok Bae, Donggeun Lim, Young Min Kim
PDF
Good4cir: Generating Detailed Synthetic Captions for Composed Image Retrieval Pranavi Kolouju, Eric Xing, Robert Pless, Nathan Jacobs, Abby Stylianou
PDF
GPT-FL: Generative Pre-Trained Model-Assisted Federated Learning Tuo Zhang, Tiantian Feng, Samiul Alam, Dimitrios Dimitriadis, Sunwoo Lee, Mi Zhang, Shrikanth S. Narayanan, Salman Avestimehr
PDF
GRS: Generating Robotic Simulation Tasks from Real-World Images Alex Zook, Fan-Yun Sun, Josef B. Spjut, Valts Blukis, Stan Birchfield, Jonathan Tremblay
PDF
GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers Lorenza Prospero, Abdullah Hamdi, João F. Henriques, Christian Rupprecht
PDF
HAECcity: Open-Vocabulary Scene Understanding of City-Scale Point Clouds with Superpoint Graph Clustering Alexander Rusnak, Frédéric Kaplan
PDF
Harmonizing Attention Fields with Knowledge Distillation for Multi-View 3D Object Detection Yafei Qi, Menghao Yang, Fan Wu, Chen Wang, Yongmin Zhang
PDF
HARMONY: Hidden Activation Representations and Model Output-Aware Uncertainty Estimation for Vision-Language Models Erum Mushtaq, Zalan Fabian, Yavuz Faruk Bakman, Anil Ramakrishna, Mahdi Soltanolkotabi, Salman Avestimehr
PDF
HCS-DFC: A Diffusion Classifier for Mode of Action Prediction Using Morphological Profiles Jakub Kosciukiewicz, Dawid Rymarczyk, Bartosz Zielinski
PDF
HDC: Hierarchical Distillation for Multi-Level Noisy Consistency in Semi-Supervised Fetal Ultrasound Segmentation Tran Quoc Khanh Le, Nguyen Lan Vi Vu, Ha-Hieu Pham, Xuan-Loc Huynh, Tien-Huy Nguyen, Minh Huu Nhat Le, Quan Nguyen, Hien D. Nguyen
PDF
HeAL3D: Heuristical-Enhanced Active Learning for 3D Object Detection Esteban Rivera, Surya Prabhakaran, Markus Lienkamp
PDF
Hierarchical Semantic Segmentation with Autoregressive Language Modeling Josh Myers-Dean, Brian L. Price, Yifei Fan, Danna Gurari
PDF
HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions Using Diffusion Models Xiaogang Peng, Yiming Xie, Zizhao Wu, Varun Jampani, Deqing Sun, Huaizu Jiang
PDF
HopNet: Harmonizing Object Placement Network for Realistic Image Generation via Object Composition Matthew Poska, Sharon X. Huang, Bin Hwang
PDF
How Does the Machine Perceive Depth for Indoor Single Images with CNN? Yihong Wu, Yuwen Heng, Mahesan Niranjan, Hansung Kim
PDF
How Good Is My Video-LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs Muhammad Uzair Khattak, Muhammad Ferjad Naeem, Jameel Hassan, Muzammal Naseer, Federico Tombari, Fahad Shahbaz Khan, Salman Khan
PDF
How Much Noise Is There in Labels Generated by Humans? A Method to Validate Automatically Generated Bounding Boxes Mariusz Karol Nowak, Jacek Cyranka, Natalia Maslany, Aleksander Kostuch, Jakub Derbisz, Mateusz Komorkiewicz, Patryk Siwek, Mateusz Wójcik, Dariusz Marchewka, Pawel Skruch
PDF
Human Mesh Reconstruction of Sports Players with Multiple Dynamic Cameras Yamato Hokari, Ryosuke Hori, Hideo Saito
PDF
Human vs. Machine Minds: Ego-Centric Action Recognition Compared Sadegh Rahmani-Boldaji, Filip Rybansky, Quoc Vuong, Frank Guerin, Andrew Gilbert
PDF
Human-Robot Navigation Using Event-Based Cameras and Reinforcement Learning Ignacio G. Bugueño-Córdova, Javier Ruiz-del-Solar, Rodrigo Verschae
PDF
HumMorph: Generalized Dynamic Human Neural Fields from Few Views Jakub Zadrozny, Hakan Bilen
PDF
Hybrid AI-Physical Modeling for Penetration Bias Correction in X-Band InSAR DEMs: A Greenland Case Study Islam Mansour, Georg Fischer, Ronny Hänsch, Irena Hajnsek
PDF
IAUNet: Instance-Aware U-Net Yaroslav Prytula, Illia Tsiporenko, Ali Zeynalli, Dmytro Fishman
PDF
IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding Lanyun Zhu, Deyi Ji, Tianrun Chen, Peng Xu, Jieping Ye, Jun Liu
PDF
Ice Hockey Puck Localization Using Contextual Cues Liam Salass, Jerrin Bright, Amir Nazemi, Yuhao Chen, John S. Zelek, David A. Clausi
PDF
ICT-QA: Question Answering over Multi-Modal Contexts Including Image, Chart, and Text Modalities Youngrok Jang, Hyesoo Kong, Gyeonghun Kim, Yejin Lee, Stanley Jungkyu Choi, Kyunghoon Bae
PDF
IdolDanceNet: Indian Heritage Idol Dance Pose Classification Kanimozhi Soundararajan, Sabari Nathan, A. Sasithradevi
PDF
IGL-DT: Iterative Global-Local Feature Learning with Dual-Teacher Semantic Segmentation Framework Under Limited Annotation Scheme Quan Tran, Hoang-Thien Nguyen, Thanh-Huy Nguyen, Gia-Van To, Tien-Huy Nguyen, Quan Nguyen
PDF
IL-NeRF: Incremental Learning for Neural Radiance Fields with Camera Pose Alignment Letian Zhang, Ming Li, Chen Chen, Jie Xu
PDF
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions Mohammadmostafa Rostamkhani, Baktash Ansari, Hoorieh Sabzevari, Farzan Rahmani, Sauleh Eetemadi
PDF
IMC: A Benchmark for Invariant Learning Under Multiple Causes Taero Kim, Seonggyun Lee, Joonseong Kang, Youngjun Choi, Wonsang Yun, Nicole Hee-Yeon Kim, Ziyu Chen, Lexing Xie, Kyungwoo Song
PDF
Improved Out-of-Distribution Detection with Additive Angular Margin Loss Deepak Ravikumar, Efstathia Soufleri, Kaushik Roy
PDF
Improving Multimodal Hateful Meme Detection Exploiting LMM-Generated Knowledge Maria Tzelepi, Vasileios Mezaris
PDF
Improving Open-World Object Localization by Discovering Background Ashish Singh, Michael Jones, Kuan-Chuan Peng, Anoop Cherian, Moitreya Chatterjee, Erik G. Learned-Miller
PDF
Improving Optical Flow and Stereo Depth Estimation by Leveraging Uncertainty-Based Learning Difficulties Jisoo Jeong, Hong Cai, Jamie Menjay Lin, Fatih Porikli
PDF
Improving Weather-Based OOD Generalisation in LiDAR-Based Object Detection Models via Adversarial Training Ben Batten, Alessio Lomuscio
PDF
iNatAg: Multi-Class Classification Models Enabled by a Large-Scale Benchmark Dataset with 4.7M Images of 2,959 Crop and Weed Species Naitik Jain, Amogh Joshi, Mason Earles
PDF
Inferring Driving Maps by Deep Learning-Based Trail Map Extraction Michael Hubbertz, Pascal Colling, Qi Han, Tobias Meisen
PDF
Instance Feature Caching for Cross-Domain Few-Shot Object Detection Yali Huang, Jie Mei, Yiming Yang, Mi Guo, Mingyuan Jiu, Mingliang Xu
PDF
Instruction-Augmented Multimodal Alignment for Image-Text and Element Matching Xinli Yue, Jianhui Sun, Junda Lu, Liangchao Yao, Fan Xia, Tianyi Wang, Fengyun Rao, Jing Lyu, Yuetang Deng
PDF
Interactive Multimodal Framework with Temporal Modeling for Emotion Recognition Jun Yu, Yongqi Wang, Lei Wang, Yang Zheng, Shengfan Xu
PDF
Intriguing Properties of Robust Classification Bernd Prach, Christoph H. Lampert
Investigating Mechanisms for In-Context Vision Language Binding Darshana Saravanan, Makarand Tapaswi, Vineet Gandhi
PDF
Is Multi-Person Gait Recognition Feasible Under Mutual Occlusion? A Human Model Regression-Based Approach Ziruo Li, Chi Xu, Xiang Li, Shuqiong Wu, Yasushi Yagi
PDF
Is Temporal Prompting All We Need for Limited Labeled Action Recognition? Shreyank N. Gowda, Boyan Gao, Xiao Gu, Xiao-Bo Jin
PDF
ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements M. Arda Aydin, Efe Mert Çirpar, Elvin Abdinli, Gozde Unal, Yusuf Hüseyin Sahin
PDF
Iterative Event-Based Motion Segmentation by Variational Contrast Maximization Ryo Yamaki, Shintaro Shiba, Guillermo Gallego, Yoshimitsu Aoki
PDF
Jump-Aware: Player Position Rectification and Identification in Dynamic Sports Using Jump Event Spotting Yin May Oo, Ankhzaya Jamsrandorj, Vanyi Chao, Hoang Quoc Nguyen, Yewon Hwang, Kyung-Ryoul Mun, Jinwook Kim
PDF
KernFusNet: Implicit Kernel Modulation and Fusion for Blind Super-Resolution Nancy Mehta, Akshay Dudhane, Subrahmanyam Murala, Radu Timofte
PDF
Knowledge Distillation Approach for SOS Fusion Staging: Towards Fully Automated Skeletal Maturity Assessment Omid Halimi Milani, Amanda Nikho, Marouane Tliba, Lauren Mills, Ahmet Enis Çetin, Mohammed H. Elnagar
PDF
KOFFVQA: An Objectively Evaluated Free-Form VQA Benchmark for Large Vision-Language Models in the Korean Language Yoonshik Kim, Jaeyoon Jung
PDF
LADI V2: Multi-Label Dataset and Classifiers for Low-Altitude Disaster Imagery Samuel Scheele, Katherine Picchione, Jeffrey Liu
PDF
LangCoop: Collaborative Driving with Language Xiangbo Gao, Yuheng Wu, Rujia Wang, Chenxi Liu, Yang Zhou, Zhengzhong Tu
PDF
LangGas: Introducing Language in Selective Zero-Shot Background Subtraction for Semi-Transparent Gas Leak Detection with a New Dataset Wenqi Guo, Yiyang Du, Shan Du
PDF
Language-Guided Trajectory Traversal in Disentangled Stable Diffusion Latent Space for Factorized Medical Image Generation Zahra Tehraninasab, Amar Kumar, Tal Arbel
PDF
LAPIS: A Novel Dataset for Personalized Image Aesthetic Assessment Anne-Sofie Maerten, Li-Wei Chen, Stefanie De Winter, Christophe Bossens, Johan Wagemans
PDF
Latent Patched Efficient Diffusion Model for High Resolution Image Synthesis Weiyun Jiang, Devendra K. Jangid, Seok-Jun Lee, Hamid R. Sheikh
PDF
Learned Lightweight Smartphone ISP with Unpaired Data Andrei Arhire, Radu Timofte
PDF
Learned Smartphone ISP on Mobile GPUs, Mobile AI 2025 Challenge: Report Andrey Ignatov, Georgy Perevozchikov, Radu Timofte, Cheng Li, Lian Liu, Jun Cao, Heng Sun, Wu Pan, Song Wang, Keqiang Yu, Shuo Liu, Hongqin He, Zhenhao Dong, Jianke Chen, Dejun Hao, Keqiang Yu, Tingniao Wang, Xiaoqing Zhou, Dong Zhang, Chunxia Zhang, Jianguang He, Hailong Yan, Ao Li, Xiangtao Zhang, Zhe Liu, Ce Zhu, Le Zhang, Andrei Arhire, Shuo Liu, Junpyo Seo, Fen Xie, Xiuzhi Fang, Chen Wu, Zhangsheng Wang, Pengbo Zhang, Jiazi Huang
PDF
Learning from Noise: Enhancing DNNs for Event-Based Vision Through Controlled Noise Injection Marcin Kowalczyk, Kamil Jeziorek, Tomasz Kryjak
PDF
Learning Optical Flow Field via Neural Ordinary Differential Equation Leyla Mirvakhabova, Hong Cai, Jisoo Jeong, Hanno Ackermann, Farhad G. Zanjani, Fatih Porikli
PDF
Learning Pose-Aware Representations in Vision Transformers for Understanding Activities of Daily Living Dominick Reilly, Srijita Das, Srijan Das
PDF
Learning to Drive from a World Model Mitchell Goff, Greg Hogan, George Hotz, Armand du Parc Locmaria, Kacper Raczy, Harald Schäfer, Adeeb Shihadeh, Weixing Zhang, Yassine Yousfi
PDF
Less Biased Noise Scale Estimation for Threshold-Robust RANSAC Johan Edstedt
PDF
Leveraging Anthropometric Measurements to Improve Human Mesh Estimation and Ensure Consistent Body Shapes Katja Ludwig, Julian Lorenz, Daniel Kienzle, Tuan Bui, Rainer Lienhart
PDF
Leveraging Fixed and Dynamic Pseudo-Labels in Cross-Supervision Framework for Semi-Supervised Medical Image Segmentation Suruchi Kumari, Pravendra Singh
PDF
Leveraging Intermediate Features of Vision Transformer for Face Anti-Spoofing Mika Feng, Koichi Ito, Takafumi Aoki, Tetsushi Ohki, Masakatsu Nishigaki
PDF
Leveraging Lightweight Facial Models and Textual Modality in Audio-Visual Emotional Understanding In-the-Wild Andrey V. Savchenko, Lyudmila V. Savchenko
PDF
Leveraging Multimodal Large Language Models for Joint Discrete and Continuous Evaluation in Text-to-Image Alignment Zhichao Zhang, Xinyue Li, Wei Sun, Zicheng Zhang, Yunhao Li, Xiaohong Liu, Guangtao Zhai
PDF
Leveraging Synthetic Adult Datasets for Unsupervised Infant Pose Estimation Sarosij Bose, Hannah Dela Cruz, Arindam Dutta, Elena Kokkoni, Konstantinos Karydis, Amit K. Roy-Chowdhury
PDF
Leveraging Vision-Language Foundation Models to Reveal Hidden Image-Attribute Relationships in Medical Imaging Amar Kumar, Anita Kriz, Barak Pertzov, Tal Arbel
PDF
LFMix: A Lightweight Hybrid Architecture for Light Field Super-Resolution Mingyang Yu, Zhijian Wu, Dingjiang Huang
PDF
LFTramba: Comprehensive Information Learning for Light Field Image Super-Resolution via a Hybrid Transformer-Mamba Framework Haosong Liu, Xiancheng Zhu, Huanqiang Zeng, Jianqing Zhu, Yifan Shi, Jing Chen, Junhui Hou
PDF
LFTransMamba: A Hybrid Mamba-Transformer Model for Light Field Image Super-Resolution Kai Jin, Zeqiang Wei, Angulia Yang, Di Wu, Mingzhi Gao, Xiuzhuang Zhou
PDF
Live Demonstration: NeuroTouch - A Neuromorphic Vision-Based Tactile Sensor for Real-Time Gesture Recognition Victor Hoffmann, Valentina Cavinato, Kirk Y. W. Scheper
PDF
Live Demonstration: Real-Time Event-Data Processing with Graph Convolutional Neural Networks and SoC FPGA Piotr Wzorek, Krzysztof Blachut, Kamil Jeziorek, Tomasz Kryjak
PDF
LLaVA-SCo: Teach Vision Language Models to Self-Correct Zixuan Liu, Guangkai Jiang, Siavash H. Khajavi
PDF
LLMPi: Optimizing LLMs for High-Throughput on Raspberry Pi Mahsa Ardakani, Jinendra Malekar, Ramtin Zand
PDF
LMFormer: Lane Based Motion Prediction Transformer Harsh Yadav, Maximilian Schäfer, Kun Zhao, Tobias Meisen
PDF
LNTransformer: Lung Nodule Transformer for Sparse CT Segmentation Hooman Ramezani, Charlotte Vedrines, Dionne M. Aleman, Daniel Létourneau
PDF
Location-Free Scene Graph Generation Ege Özsoy, Felix Holm, Chantal Pellegrini, Tobias Czempiel, Mahdi Saleh, Nassir Navab, Benjamin Busam
PDF
Looking into the Shadow: Recording a Total Solar Eclipse with High-Resolution Event Cameras Fernando Cladera, Kenneth Chaney, Caroline Pritchard, M. Ani Hsieh, Vijay Kumar, Camillo J. Taylor, Kostas Daniilidis
PDF
Low-Frame-Rate Cell Tracking: Unmet Needs and Future Directions Mina Gachloo, Akhila Nangineedi, Mahsa Partovi, Fardifa Fathmiul Alam, Tzu-Yu Chu, James Schvaneveldt, Xiaoming Lu, Tirthankar Biswas, Marc R. Birtwistle, Federico Iuricich
PDF
Low-Resource Video Super-Resolution Using Memory, Wavelets, and Deformable Convolutions Kavitha Viswanathan, Amit Sethi, Shashwat Pathak, Piyush Bharambe, Harsh Choudhary
PDF
LVP-CLIP: Revisiting CLIP for Continual Learning with Label Vector Pool Yue Ma, Huantao Ren, Boyu Wang, Jingang Jin, Senem Velipasalar, Qinru Qiu
PDF
M-Adaptor: Text-Driven Whole-Body Human Motion Generation Alicia Li, Xiaodong Chen, Bohao Liang, Qian Bao, Wu Liu
PDF
Machine Unlearning in Hyperbolic vs. Euclidean Multimodal Contrastive Learning: Adapting Alignment Calibration to MERU Àlex Pujol Vidal, Kamal Nasrollahi, Thomas B. Moeslund, Sergio Escalera
PDF
MAD: Makeup All-in-One with Cross-Domain Diffusion Model Bo-Kai Ruan, Hong-Han Shuai
PDF
Maize Ear Sensing for On-Farm Yield Predictions Pedro Cisdeli, Gustavo Nocera Santiago, German Mandrini, Ignacio Antonio Ciampitti
PDF
Making Every Event Count: Balancing Data Efficiency and Accuracy in Event Camera Subsampling Hesam Araghi, Jan van Gemert, Nergis Tomen
PDF
Mamba-VA: A Mamba-Based Approach for Continuous Emotion Recognition in Valence-Arousal Space Yuheng Liang, Zheyu Wang, Feng Liu, Mingzhou Liu, Yu Yao
PDF
Mapping Biodiversity at Very-High Resolution in Europe César Leblanc, Lukás Picek, Rémi Palard, Benjamin Deneu, Maximilien Servajean, Pierre Bonnet, Alexis Joly
PDF
MaskAdapt: Unsupervised Geometry-Aware Domain Adaptation Using Multimodal Contextual Learning and RGB-Depth Masking Numair Nadeem, Muhammad Hamza Asad, Saeed Anwar, Abdul Bais
PDF
MAVEN: Multi-Modal Attention for Valence-Arousal Emotion Network Vrushank Ahire, Kunal Shah, Mudasir Nazir Khan, Nikhil Pakhale, Lownish Rai Sookha, Mudasir Ahmad Ganaie, Abhinav Dhall
PDF
Maximizing Aerial Detection of Organic Objects in Non-Exhaustively Searchable Survey Area Amir Ehsan Niaraki Asli, Jansel Herrera-Gerena, Jeremy Roghair, Ali Jannesari
PDF
MDMP: Multi-Modal Diffusion for Supervised Motion Predictions with Uncertainty Leo Bringer, Joey Wilson, Kira Barton, Maani Ghaffari
PDF
MegaLoc: One Retrieval to Place Them All Gabriele Moreno Berton, Carlo Masone
PDF
MerCulture: A Comprehensive Benchmark to Evaluate Vision-Language Models on Cultural Understanding in Singapore Tushar Pranav, Eshan Pandey, Lyka Diane Bala Austria, Yin Yin Loo, Jing Hao Lim, Indriyati Atmosukarto, Donny Cheng Lock Soh
PDF
MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data Paul Borne-Pons, Mikolaj Czerkawski, Rosalie Martin, Romain Rouffet
PDF
MFSR-GAN: Multi-Frame Super-Resolution with Handheld Motion Modeling Fadeel Sher Khan, Joshua Ebenezer, Hamid R. Sheikh, Seok-Jun Lee
PDF
Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model Navin Ranjan, Andreas E. Savakis
PDF
Mixture-of-Shape-Experts (MoSE): End-to-End Shape Dictionary Framework to Prompt SAM for Generalizable Medical Segmentation Jia Wei, Xiaoqi Zhao, Jonghye Woo, Jinsong Ouyang, Georges El Fakhri, Qingyu Chen, Xiaofeng Liu
PDF
MMDrive: Multi-Modal Remote Physiological Signal Measurement Dataset for Driver Status Monitoring Jiho Choi, Sang Jun Lee
PDF
MObI: Multimodal Object Inpainting Using Diffusion Models Alexandru Buburuzan, Anuj Sharma, John Redford, Puneet K. Dokania, Romain Mueller
PDF
MoCLIP: Motion-Aware Fine-Tuning and Distillation of CLIP for Human Motion Generation Gabriel Maldonado, Armin Danesh Pazho, Ghazal Alinezhad Noghre, Vinit Katariya, Hamed Tabkhi
PDF
MoF-Image: Generating Mixture-of-Features Video Game Image Dataset via GPU Rendering Simulation Yu Wen, Xingke Yang, Aamir Bader Shah, Ruizhi Cao, Miao Pan, Chenhao Xie, Xin Fu
PDF
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training Kengo Uchida, Takashi Shibuya, Yuhta Takida, Naoki Murata, Julian Tanke, Shusuke Takahashi, Yuki Mitsufuji
PDF
MoPEFT: A Mixture-of-PEFTs for the Segment Anything Model Rajat Sahay, Andreas E. Savakis
PDF
MTA-VPS: A Large-Scale Benchmark for Video-Based Person Search Ding Qi, Shuguang Dou, Jian Liu, Huaixuan Cao, Hao Zhang, Dongsheng Jiang, Cairong Zhao
PDF
MTevent: A Multi-Task Event Camera Dataset for 6D Pose Estimation and Moving Object Detection Shrutarv Awasthi, Anas Gouda, Sven Franke, Jérôme Rutinowski, Frank Hoffmann, Moritz Roidl
PDF
Multi-Agent Systems for Robotic Autonomy with LLMs Junhong Chen, Ziqi Yang, Haoyuan G. Xu, Dandan Zhang, George P. Mylonas
PDF
Multi-Aspect Knowledge Distillation with Large Language Model Taegyeong Lee, Jinsik Bang, Soyeong Kwon, Taehwan Kim
PDF
Multi-Dimensional Quality Assessment for UGC Videos via Modular Multi-Modal Vision-Language Models Weixia Zhang, Bingkun Zheng, Junlin Chen, Zhihua Wang
PDF
Multi-Entity Video Transformers for Fine-Grained Video Representation Learning Matthew Walmer, Rose Catherine Kanjirathinkal, Kai Sheng Tai, Keyur Muzumdar, Tai-Peng Tian, Abhinav Shrivastava
PDF
Multi-Flow: Multi-View-Enriched Normalizing Flows for Industrial Anomaly Detection Mathis Kruse, Bodo Rosenhahn
PDF
Multi-Layer Radial Basis Function Networks for Out-of-Distribution Detection Amol Khanna, Chenyi Ling, Derek Everett, Edward Raff, Nathan Inkawhich
PDF
Multi-Person Physics-Based Pose Estimation for Combat Sports Hossein Feizollah Zadeh Khoiee, David R. Labbé, Thomas Romeas, Jocelyn Faubert, Sheldon Andrews
Multi-Spectral Imaging and Data Fusion for Real-Time Bleeding Detection Ghazal Rouhafzay, Stephen Rowlands, Angel J. Valencia, Shengsong Yang, Pierre Payeur, Haitao Tian, James Dickens
PDF
Multimodal 3D Object Detection on Unseen Domains Deepti Hegde, Suhas Lohit, Kuan-Chuan Peng, Michael Jones, Vishal Patel
PDF
Multimodal Emotion Prediction in Interpersonal Videos Integrating Facial and Speech Cues Hajer Guerdelli, Claudio Ferrari, Stefano Berretti, Alberto Del Bimbo
PDF
Multimodal Generalized Category Discovery Yuchang Su, Renping Zhou, Siyu Huang, Xingjian Li, Tianyang Wang, Ziyue Wang, Min Xu
PDF
Multimodal Rationales for Explainable Visual Question Answering Kun Li, George Vosselman, Michael Ying Yang
PDF
Multiple Instance Learning for Visual Grain Quality Analysis Without Instance-Level Annotation Bradley Ezard, Ling Li, Senjian An
PDF
MVCM: Enhancing Multi-View and Cross-Modality Alignment for Medical Visual Question Answering and Medical Image-Text Retrieval Yuanhao Zou, Zhaozheng Yin
PDF
NadirFloorNet: Reconstructing Multi-Room Floorplans from a Small Set of Registered Panoramic Images Giovanni Pintore, Uzair Shah, Marco Agus, Enrico Gobbetti
PDF
Nanoparticle Diameter Measurements with Event Camera Tracking Michael C. Daugherty, Matthew DiSalvo, Aaron Goldfain, Alexander Peterson, Edward Kwee, Thomas Germer, Gregory Cooksey, Jagat Budhathoki, Peter Bajcsy
PDF
Naturally Computed Scale Invariance in the Residual Stream of ResNet18 André Longon
PDF
Near-Incident Detection in Railroad Environments: Lateral Distance Estimation from Train-Mounted Monocular Camera Yilei Wang, Giacomo D'Amicantonio, Egor Bondarev
PDF
Neighbor-Based Feature and Index Enhancement for Person Re-Identification Chao Yuan, Tianyi Zhang, Guanglin Niu
PDF
NeIn: Telling What You Don't Want Nhat-Tan Bui, Dinh-Hieu Hoang, Quoc-Huy Trinh, Minh-Triet Tran, Truong Nguyen, Susan Gauch
PDF
NeuRadar: Neural Radiance Fields for Automotive Radar Point Clouds Mahan Rafidashti, Ji Lan, Maryam Fatemi, Junsheng Fu, Lars Hammarstrand, Lennart Svensson
PDF
Nexar Dashcam Collision Prediction Dataset and Challenge Daniel C. Moura, Shizhan Zhu, Orly Zvitia
PDF
NExNet Seg: Neuron Expansion Network for Medical Image Segmentation Abel A. Reyes Angulo, Sidike Paheding
PDF
No Train Yet Gain: Towards Generic Multi-Object Tracking in Sports and Beyond Tomasz Stanczyk, Seongro Yoon, François Brémond
PDF
No-MambAAD: Revitalizing Conv-Only Networks for Unsupervised Anomaly Detection Masud An Nur Islam Fahim, Jani Boutellier
PDF
Noise Consistency Regularization for Improved Subject-Driven Image Synthesis Yao Ni, Song Wen, Piotr Koniusz, Anoop Cherian
PDF
NTIRE 2025 Ambient Lighting Normalization Challenge Report Florin-Alexandru Vasluianu, Tim Seizinger, Zhuyun Zhou, Zongwei Wu, Radu Timofte, Yuanfei Bao, Xingbo Wang, Xin Lu, Jiarong Yang, Anya Hu, Kunyu Wang, Jie Xiao, Dong Li, Xueyang Fu, Zheng-Jun Zha, Zihao Fan, Xi Wang, Yurui Zhu, Kean Liu, Senyan Xu, Hongjian Liu, Yupeng Xiao, David Serrano-Lozano, Francisco A. Molina-Bakhos, Danna Xue, Yixiong Yang, Maria Pilligua, Ramon Baldrich, María Vanrell, Javier Vazquez-Corral, Xuan Sun, Zijie Lou, Ting Liu, Kuldeep Purohit, Jameer Babu Pinjari, Yilin Zhang, Huan Zheng, Yanyan Wei, Suiyi Zhao, Shengeng Tang, Zhao Zhang, Yushen Zuo, Zongqi He, Zhe Xiao, Cuixin Yang, Rongkang Dong, Jun Xiao, Kin-Man Lam, Nikhil Akalwadi, Vijayalaxmi Ashok Aralikatti, Dheeraj Damodhar Hegde, Ramesh Ashok Tabib, Uma Mudenagudi, Anas M. Ali, Bilel Benjdira, Wadii Boulila
PDF
NTIRE 2025 Challenge on Cross-Domain Few-Shot Object Detection: Methods and Results Yuqian Fu, Xingyu Qiu, Bin Ren, Yanwei Fu, Radu Timofte, Nicu Sebe, Ming-Hsuan Yang, Luc Van Gool, Kaijin Zhang, Qingpeng Nong, Xiugang Dong, Hong Gao, Xiangsheng Zhou, Jiancheng Pan, Yanxing Liu, Xiao He, Jiahao Li, Yuze Sun, Xiaomeng Huang, Zhenyu Zhang, Ran Ma, Yuhan Liu, Zijian Zhuang, Shuai Yi, Yixiong Zou, Lingyi Hong, Mingxi Chen, Runze Li, Xingdong Sheng, Wenqiang Zhang, Weisen Chen, Yongxin Yan, Xinguo Chen, Yuanjie Shao, Zhengrong Zuo, Nong Sang, Hao Wu, Haoran Sun, Shuming Hu, Yan Zhang, Zhiguang Shi, Yu Zhang, Chao Chen, Tao Wang, Da Feng, Linhai Zhuo, Ziming Lin, Yali Huang, Jie Mei, Yiming Yang, Mi Guo, Mingyuan Jiu, Mingliang Xu, Maomao Xiong, Qunshu Zhang, Xinyu Cao, Yuqing Yang, Dianmo Sheng, Xuanpu Zhao, Zhiyu Li, Xuyang Ding, Wenqian Li
PDF
NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results Xin Li, Yeying Jin, Xin Jin, Zongwei Wu, Bingchen Li, Yufei Wang, Wenhan Yang, Yu Li, Zhibo Chen, Bihan Wen, Robby T. Tan, Radu Timofte, Qiyu Rong, Hongyuan Jing, Mengmeng Zhang, Jinglong Li, Xiangyu Lu, Yi Ren, Yuting Liu, Meng Zhang, Xiang Chen, Qiyuan Guan, Jiangxin Dong, Jinshan Pan, Conglin Gou, Qirui Yang, Fangpu Zhang, Yunlong Lin, Sixiang Chen, Guoxi Huang, Ruirui Lin, Yan Zhang, Jingyu Yang, Huanjing Yue, Jiyuan Chen, Qiaosi Yi, Hongjun Wang, Chenxi Xie, Shuai Li, Yuhui Wu, Kaiyi Ma, Jiakui Hu, Juncheng Li, Liwen Pan, Guangwei Gao, Wenjie Li, Zhenyu Jin, Heng Guo, Zhanyu Ma, Yubo Wang, Jinghua Wang, Wangzhi Xing, Anjusree Karnavar, Diqi Chen, Mohammad Aminul Islam, Hao Yang, Ruikun Zhang, Liyuan Pan, Qianhao Luo, Xin Cao, Han Zhou, Yan Min, Wei Dong, Jun Chen, Taoyi Wu, Weijia Dou, Yu Wang, Shengjie Zhao, Yongcheng Huang, Xingyu Han, Anyan Huang, Hongtao Wu, Hong Wang, Yefeng Zheng, Abhijeet Kumar, Aman Kumar, Marcos V. Conde, Paula Garrido, Daniel Feijoo, Juan C. Benito, Guanglu Dong, Xin Lin, Siyuan Liu, Tianheng Zheng, Jiayu Zhong, Shouyi Wang, Xiangtai Li, Lanqing Guo, Lu Qi, Chao Ren, Shuaibo Wang, Shilong Zhang, Wanyu Zhou, Yunze Wu, Qinzhong Tan, Jieyuan Pei, Zhuoxuan Li, Jiayu Wang, Haoyu Bian, Haoran Sun, Subhajit Paul, Ni Tang, Junhao Huang, Zihan Cheng, Hongyun Zhu, Yuehan Wu, Kaixin Deng, Huang Ouyang, Tianxin Xiao, Fan Yang, Zhizun Luo, Zeyu Xiao, Zhuoyuan Li, Pham Hoang Le Nguyen, Dinh Thien An, Luu Thanh Son, Kiet Van Nguyen, Ronghua Xu, Xianmin Tian, Weijian Zhou, Jiacheng Zhang, Yuqian Chen, Yihang Duan, Yujie Wu, Suresh Raikwar, Arsh Garg, Kritika Kritika, Jianhua Zheng, Xiaoshan Ma, Ruolin Zhao, Yongyu Yang, Yongsheng Liang, Guiming Huang, Qiang Li, Hongbin Zhang, Xiangyu Zheng, A. N. Rajagopalan
PDF
NTIRE 2025 Challenge on Efficient Burst HDR and Restoration: Datasets, Methods, and Results Sangmin Lee, Eunpil Park, Angel Canelo, Hyunhee Park, Youngjo Kim, Hyung-Ju Chun, Xin Jin, Chongyi Li, Chun-Le Guo, Radu Timofte, Qi Wu, Tianheng Qiu, Yuchun Dong, Shenglin Ding, Guanghua Pan, Weiyu Zhou, Tao Hu, Yixu Feng, Duwei Dai, Yu Cao, Peng Wu, Wei Dong, Yanning Zhang, Qingsen Yan, Simon J. Larsen, Senyan Xu, Xingbo Wang, Ruixuan Jiang, Xin Lu, Marcos V. Conde, Javier Abad-Hernández, Álvaro García-Lara, Daniel Feijoo, Álvaro García, Zeyu Xiao, Zhuoyuan Li
PDF
NTIRE 2025 Challenge on Event-Based Image Deblurring: Methods and Results Lei Sun, Andrea Alfarano, Peiqi Duan, Shaolin Su, Kaiwei Wang, Boxin Shi, Radu Timofte, Danda Pani Paudel, Luc Van Gool, Qinglin Liu, Wei Yu, Xiaoqian Lv, Lu Yang, Shuigen Wang, Shengping Zhang, Xiangyang Ji, Long Bao, Yuqiang Yang, Jinao Song, Ziyi Wang, Shuang Wen, Heng Sun, Kean Liu, Mingchen Zhong, Senyan Xu, Zhijing Sun, Jiaying Zhu, Chengjie Ge, Xingbo Wang, Yidi Liu, Xin Lu, Xueyang Fu, Zheng-Jun Zha, Dawei Fan, Dafeng Zhang, Yong Yang, Siru Zhang, Qinghua Yang, Hao Kang, Huiyuan Fu, Heng Zhang, Hongyuan Yu, Zhijuan Huang, Shouyan Wei, Feng Li, Runmin Cong, Weiqi Luo, Mingyun Lin, Chenxu Jiang, Hongyi Liu, Lei Yu, Weilun Li, Jiajun Zhai, Tingting Lin, Shuang Ma, Sai Zhou, Zhanwen Liu, Yang Wang, Eiffel Chong, Nuwan Bandara, Thivya Kandappu, Archan Misra, Yihang Chen, Zhan Li, Weijun Yuan, Wenzhuo Wang, Boyang Yao, Zhanglu Chen, Yijing Sun, Tianjiao Wan, Zijian Gao, Qisheng Xu, Kele Xu, Yukun Zhang, Yu He, Xiaoyan Xie, Tao Fu, Yashu Guatamkumar Patel, Vihar Ramesh Jain, Divesh Basina, Rishik Ashili, Manish Kumar Manjhi, Sourav Kumar, Prinon Benny, Himanshu Ghunawat, B. Sri Sairam Gautam, Anett Varghese, Abhishek Yadav
PDF
NTIRE 2025 Challenge on HR Depth from Images of Specular and Transparent Surfaces Pierluigi Zama Ramirez, Fabio Tosi, Luigi Di Stefano, Radu Timofte, Alex Costanzino, Matteo Poggi, Samuele Salti, Stefano Mattoccia, Zhe Zhang, Yang Yang, Wu Chen, Anlong Ming, Mingshuai Zhao, Mengying Yu, Shida Gao, Xiangfeng Wang, Feng Xue, Jun Shi, Yong Yang, Yong A, Yixiang Jin, Dingzhe Li, Aryan Shukla, Liam Frija-Altarac, Matthew Toews, Hui Geng, Tianjiao Wan, Zijian Gao, Qisheng Xu, Kele Xu, Zijian Zang, Jameer Babu Pinjari, Kuldeep Purohit, Mykola Lavreniuk, Jing Cao, Shenyi Li, Kui Jiang, Junjun Jiang, Yong Huang
PDF
NTIRE 2025 Challenge on Image Super-Resolution (x4): Methods and Results Zheng Chen, Kai Liu, Jue Gong, Jingkai Wang, Lei Sun, Zongwei Wu, Radu Timofte, Yulun Zhang, Xiangyu Kong, Xiaoxuan Yu, Hyunhee Park, Suejin Han, Hakjae Jeon, Dafeng Zhang, Hyung-Ju Chun, Donghun Ryou, Inju Ha, Bohyung Han, Lu Zhao, Yuyi Zhang, Pengyu Yan, Jiawei Hu, Pengwei Liu, Fengjun Guo, Hongyuan Yu, Pufan Xu, Zhijuan Huang, Shuyuan Cui, Peng Guo, Jiahui Liu, Dongkai Zhang, Heng Zhang, Huiyuan Fu, Huadong Ma, Yanhui Guo, Sisi Tian, Xin Li, Jinwen Liang, Jie Liu, Jie Tang, Gangshan Wu, Zeyu Xiao, Zhuoyuan Li, Yinxiang Zhang, Wenxuan Cai, Vijayalaxmi Ashok Aralikatti, Nikhil Akalwadi, G. Gyaneshwar Rao, Chaitra Desai, Ramesh Ashok Tabib, Uma Mudenagudi, Marcos V. Conde, Alejandro Merino, Bruno Longarela, Javier Abad, Weijun Yuan, Zhan Li, Zhanglu Chen, Boyang Yao, Aagam Jain, Milan Kumar Singh, Ankit Kumar, Shubh Kawa, Divyavardhan Singh, Anjali Sarvaiya, Kishor P. Upla, Raghavendra Ramachandra, Chia-Ming Lee, Yu-Fan Lin, Chih-Chung Hsu, Risheek V. Hiremath, Palani Yashaswini, Yuxuan Jiang, Qiang Zhu, Siyue Teng, Fan Zhang, Shuyuan Zhu, Bing Zeng, David Bull, Jingwei Liao, Yuqing Yang, Wenda Shao, Junyi Zhao, Qisheng Xu, Kele Xu, Sunder Ali Khowaja, Ik Hyun Lee, Snehal Singh Tomar, Rajarshi Ray, Klaus Mueller, Sachin Chaudhary, Surya Vashisth, Akshay Dudhane, Praful Hambarde, Satya Narayan Tazi, Prashant W. Patil, Santosh Kumar Vipparthi, Subrahmanyam Murala, Bilel Benjdira, Anas M. Ali, Wadii Boulila, Zahra Moammeri, Ahmad Mahmoudi-Aznaveh, Ali Karbasi, Hossein Motamednia, Liangyan Li, Guanhua Zhao, Kevin Le, Yimo Ning, Haoxuan Huang, Jun Chen
PDF
NTIRE 2025 Challenge on Light Field Image Super-Resolution: Methods and Results Yingqian Wang, Zhengyu Liang, Fengyuan Zhang, Lvli Tian, Longguang Wang, Juncheng Li, Jungang Yang, Radu Timofte, Yulan Guo, Kai Jin, Zeqiang Wei, Angulia Yang, Di Wu, Mingzhi Gao, Xiuzhuang Zhou, Yue Yan, Yuaho Wang, Shuang Chen, Zeping Tian, Yizhi Hu, Yao Lu, Haosong Liu, Xiancheng Zhu, Huanqiang Zeng, Jianqing Zhu, Yifan Shi, Junhui Hou, Mingyang Yu, Zhijian Wu, Dingjiang Huang, Wenli Zheng, Zekai Xu, Huiyuan Fu, Heng Zhang, Zhijuan Huang, Hongyuan Yu, Zeke Zexi Hu, Haodong Chen, Vera Yuk Ying Chung, Xiaoming Chen, Zean Chen, Yeyao Chen, Gangyi Jiang, Haiyong Xu, Ting Luo, Guanglong Liao, Danhao Zhang, Siyu Zhang, Wendong Mao, Zhongfeng Wang, Sunita Arya, Abhishek Kumar Sinha, S. Manthira Moorthi, Hao Zhang, Hao Sheng, Da Yang, Zhenglong Cui, Shuai Wang, Haotian Zhang, Xingzheng Wang, Yuanbo Huang, Jiahao Lin, Yuhang Lin, Ahmed Salem, Ebrahem Elkady, Hatem Ibrahem, Jae-Won Suh, Hyun-Soo Kang, Changguang Wu, Hao Hou, Pengpeng Li, Peng Huang, Jiangxin Dong, Jinhui Tang
PDF
NTIRE 2025 Challenge on Low Light Image Enhancement: Methods and Results Xiaoning Liu, Zongwei Wu, Florin-Alexandru Vasluianu, Hailong Yan, Bin Ren, Yulun Zhang, Shuhang Gu, Le Zhang, Ce Zhu, Radu Timofte, Kangbiao Shi, Yixu Feng, Tao Hu, Yu Cao, Peng Wu, Yijin Liang, Yanning Zhang, Qingsen Yan, Han Zhou, Wei Dong, Yan Min, Mohab Kishawy, Jun Chen, Pengpeng Yu, Anjin Park, Seung-Soo Lee, Young-Joon Park, Zixiao Hu, Junyv Liu, Huilin Zhang, Jun Zhang, Fei Wan, Bingxin Xu, Hongzhe Liu, Cheng Xu, Weiguo Pan, Songyin Dai, Xunpeng Yi, Qinglong Yan, Yibing Zhang, Jiayi Ma, Changhui Hu, Kerui Hu, Donghang Jing, Tiesheng Chen, Zhi Jin, Hongjun Wu, Biao Huang, Haitao Ling, Jiahao Wu, Dandan Zhan, G. Gyaneshwar Rao, Vijayalaxmi Ashok Aralikatti, Nikhil Akalwadi, Ramesh Ashok Tabib, Uma Mudenagudi, Ruirui Lin, Guoxi Huang, Nantheera Anantrasirichai, Qirui Yang, Alexandru Brateanu, Ciprian Orhei, Cosmin Ancuti, Daniel Feijoo, Juan C. Benito, Álvaro García, Marcos V. Conde, Yang Qin, Raul Balmez, Anas M. Ali, Bilel Benjdira, Wadii Boulila, Tianyi Mao, Huan Zheng, Yanyan Wei, Shengeng Tang, Dan Guo, Zhao Zhang, Sabari Nathan, K. Uma, A. Sasithradevi, B. Sathya Bama, S. Mohamed Mansoor Roomi, Ao Li, Xiangtao Zhang, Zhe Liu, Yijie Tang, Jialong Tang, Zhicheng Fu, Gong Chen, Joe Nasti, John Nicholson, Zeyu Xiao, Zhuoyuan Li, Ashutosh Kulkarni, Prashant W. Patil, Santosh Kumar Vipparthi, Subrahmanyam Murala, Duan Liu, Weile Li, Hangyuan Lu, Rixian Liu, Tengfeng Wang, Jinxing Liang, Chenxin Yu
PDF
NTIRE 2025 Challenge on Night Photography Rendering Egor I. Ershov, Sergey Korchagin, Aleksei Khalin, Artyom Panshin, Arseniy P. Terekhin, Ekaterina Zaychenkova, Georgiy Lobarev, Vsevolod Plokhotnyuk, Denis Abramov, Elisey Zhdanov, Sofia Dorogova, Yasin Mamedov, Nikola Banic, Georgy Perevozchikov, Radu Timofte, Lize Zhang, Yuqian Zhang, Shuai Liu, Chaoyu Feng, Luyang Wang, Yibin Huang, Guangqi Shao, Xiaotao Wang, Lei Lei, Sishun Pan, Zhiqiang Zhong, Yang Yang, Anas M. Ali, Hamad Aloqayli, Bilel Benjdira, Wadii Boulila, Xiaoyang Ma, Zijun Gao, Leyi Xing, Zongqi He, Yushen Zuo, Zhe Xiao, Kin-Chung Chan, Hanmin Li, Jun Xiao, Kin-Man Lam, Yunpeng Wu, Dmitrij Manzura, Daniil Storonkin, Weixin Guo, Kele Xu, Qisheng Xu, Zijian Gao, Tianjiao Wan, Buda Vampilov, Furkan Kinli, Furkan Kiraç
PDF
NTIRE 2025 Challenge on RAW Image Restoration and Super-Resolution Marcos V. Conde, Radu Timofte, Zihao Lu, Xiangyu Kong, Xiaoxia Xing, Fan Wang, Suejin Han, MinKyu Park, Tianyu Hao, Yuhong He, Ruoqi Li, Yueqi Yang, Jianyang Yu, Kele Xu, Zisheng Xu, Yong Dou, Watchara Ruangsang, Ruixuan Jiang, Senyan Xu, Siyuan Jiang, Xueyang Fu, Zheng-Jun Zha, Jiajie Lu, Xiang Yu, Minmin Yi, Yuanjia Chen, Liwen Zhang, Zijie Jin, Tianyu Zhang, Xin Lu, Yeda Chen, Dong Liu, Li Pang, Yuhang Yang, Hongzhong Wang, Xiangyong Cao, Cheng Li, Lian Liu, Wei Song, Heng Sun, Yubo Wang, Jinghua Wang, Guanlan Hong
PDF
NTIRE 2025 Challenge on Real-World Face Restoration: Methods and Results Zheng Chen, Jingkai Wang, Kai Liu, Jue Gong, Lei Sun, Zongwei Wu, Radu Timofte, Yulun Zhang, Jianxing Zhang, Jinlong Wu, Jun Wang, Zheng Xie, Hakjae Jeon, Suejin Han, Hyung-Ju Chun, Hyunhee Park, Zhicun Yin, Junjie Chen, Ming Liu, Xiaoming Li, Chao Zhou, Wangmeng Zuo, Weixia Zhang, Dingquan Li, Kede Ma, Yun Zhang, Zhuofan Zheng, Yuyue Liu, Shizhen Tang, Zihao Zhang, Yi Ning, Hao Jiang, Wenjie An, Kangmeng Yu, Chenyang Wang, Kui Jiang, Xianming Liu, Junjun Jiang, Yingfu Zhang, Gang He, Siqi Wang, Kepeng Xu, Zhenyang Liu, Changxin Zhou, Shanlan Shen, Yubo Duan, Yiang Chen, Jin Guo, Mengru Yang, Jen-Wei Lee, Chia-Ming Lee, Chih-Chung Hsu, Hu Peng, Chunming He
PDF
NTIRE 2025 Challenge on Short-Form UGC Video Quality Assessment and Enhancement: KwaiSR Dataset and Study Xin Li, Xijun Wang, Bingchen Li, Kun Yuan, Yizhen Shao, Suhang Yao, Ming Sun, Chao Zhou, Radu Timofte, Zhibo Chen
PDF
NTIRE 2025 Challenge on Short-Form UGC Video Quality Assessment and Enhancement: Methods and Results Xin Li, Kun Yuan, Bingchen Li, Fengbin Guan, Yizhen Shao, Zihao Yu, Xijun Wang, Yiting Lu, Wei Luo, Suhang Yao, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Yabin Zhang, Ao-Xiang Zhang, Tianwu Zhi, Jianzhao Liu, Yang Li, Jingwen Xu, Yiting Liao, Yushen Zuo, Mingyang Wu, Renjie Li, Shengyun Zhong, Zhengzhong Tu, Yufan Liu, Xiangguang Chen, Zuowei Cao, Minhao Tang, Shan Liu, Kexin Zhang, Jingfen Xie, Yan Wang, Kai Chen, Shijie Zhao, Yunchen Zhang, Xiangkai Xu, Hong Gao, Ji Shi, Yiming Bao, Xiugang Dong, Xiangsheng Zhou, Yaofeng Tu, Ying Liang, Yiwen Wang, Xinning Chai, Yuxuan Zhang, Zhengxue Cheng, Yingsheng Qin, Yucai Yang, Rong Xie, Li Song, Wei Sun, Kang Fu, Linhan Cao, Dandan Zhu, Kaiwei Zhang, Yucheng Zhu, Zicheng Zhang, Menghan Hu, Xiongkuo Min, Guangtao Zhai, Zhi Jin, Jiawei Wu, Wei Wang, Wenjian Zhang, Yuhai Lan, Gaoxiong Yi, Hengyuan Na, Wang Luo, Di Wu, Mingyin Bai, Jiawang Du, Zilong Lu, Zhenyu Jiang, Hui Zeng, Ziguan Cui, Zongliang Gan, Guijin Tang, Xinglin Xie, Kehuan Song, Xiaoqiang Lu, Licheng Jiao, Fang Liu, Xu Liu, Puhua Chen, Ha Thu Nguyen, Katrien De Moor, Seyed Ali Amirshahi, Mohamed-Chaker Larabi, Qi Tang, Linfeng He, Zhiyong Gao, Zixuan Gao, Guohua Zhang, Zhiye Huang, Yi Deng, Qingmiao Jiang, Lu Chen, Yi Yang, Xi Liao, Nourine Mohammed Nadir, Yuxuan Jiang, Qiang Zhu, Siyue Teng, Fan Zhang, Shuyuan Zhu, Bing Zeng, David Bull, Meiqin Liu, Chao Yao, Yao Zhao
PDF
NTIRE 2025 Challenge on Single Image Reflection Removal in the Wild: Datasets, Methods and Results Kangning Yang, Jie Cai, Ling Ouyang, Florin-Alexandru Vasluianu, Radu Timofte, Jiaming Ding, Huiming Sun, Lan Fu, Jinlong Li, Chiu Man Ho, Zibo Meng, Mingjia Li, Hainuo Wang, Qiming Hu, Jiarui Wang, Hao Zhao, Jin Hu, Xiaojie Guo, Mengru Yang, Jing He, Yiqing Wang, Zhiyang Chen, Hao Fang, Wei Zhang, Runmin Cong, Dheeraj Damodhar Hegde, Jatin Kalal, Nikhil Akalwadi, Ramesh Ashok Tabib, Uma Mudenagudi, Yu-Fan Lin, Chia-Ming Lee, Chih-Chung Hsu, Mengxin Zhang, Sabari Nathan, K. Uma, A. Sasithradevi, B. Sathya Bama, S. Mohamed Mansoor Roomi, Bilel Benjdira, Anas M. Ali, Wadii Boulila, Wei Dong, Yunzhe Li, Ali Hussein, Han Zhou, Jun Chen, Zeyu Xiao, Zhuoyuan Li
PDF
NTIRE 2025 Challenge on Text to Image Generation Model Quality Assessment Shuhao Han, Haotian Fan, Fangyuan Kong, Wenjie Liao, Chunle Guo, Chongyi Li, Radu Timofte, Liang Li, Tao Li, Junhui Cui, Yunqiu Wang, Yang Tai, Jingwei Sun, Jianhui Sun, Xinli Yue, Tianyi Wang, Huan Hou, Junda Lu, Xinyang Huang, Zitang Zhou, Zijian Zhang, Xuhui Zheng, Xuecheng Wu, Chong Peng, Xuezhi Cao, Trong-Hieu Nguyen-Mau, Minh-Hoang Le, Minh-Khoa Le-Phan, Duy-Nam Ly, Hai-Dang Nguyen, Minh-Triet Tran, Yukang Lin, Yan Hong, Chuanbiao Song, Siyuan Li, Jun Lan, Zhichao Zhang, Xinyue Li, Wei Sun, Zicheng Zhang, Yunhao Li, Xiaohong Liu, Guangtao Zhai, Zitong Xu, Huiyu Duan, Jiarui Wang, Guangji Ma, Liu Yang, Lu Liu, Qiang Hu, Xiongkuo Min, Zichuan Wang, Zhenchen Tang, Bo Peng, Jing Dong, Fengbin Guan, Zihao Yu, Yiting Lu, Wei Luo, Xin Li, Minhao Lin, Haofeng Chen, Xuanxuan He, Kele Xu, Qisheng Xu, Zijian Gao, Tianjiao Wan, Bo-Cheng Qiu, Chih-Chung Hsu, Chia-Ming Lee, Yu-Fan Lin, Bo Yu, Zehao Wang, Da Mu, Mingxiu Chen, Junkang Fang, Huamei Sun, Wending Zhao, Zhiyu Wang, Wang Liu, Weikang Yu, Puhong Duan, Bin Sun, Xudong Kang, Shutao Li, Shuai He, Lingzhi Fu, Heng Cong, Rongyu Zhang, Jiarong He, Zhishan Qiao, Yongqing Huang, Zewen Chen, Zhe Pang, Juan Wang, Jian Guo, Zhizhuo Shao, Ziyu Feng, Bing Li, Weiming Hu, Hesong Li, Dehua Liu, Zeming Liu, Qingsong Xie, Ruichen Wang, Zhihao Li, Yuqi Liang, Jianqi Bi, Jun Luo, Junfeng Yang, Can Li, Jing Fu, Hongwei Xu, Mingrui Long, Lulin Tang
PDF
NTIRE 2025 Challenge on UGC Video Enhancement: Methods and Results Nickolay Safonov, Alexey Bryntsev, Andrey Moskalenko, Dmitry Kulikov, Dmitriy S. Vatolin, Radu Timofte, Haibo Lei, Qifan Gao, Qing Luo, Yaqing Li, Jie Song, Shaozhe Hao, Meisong Zheng, Jingyi Xu, Chengbin Wu, Jiahui Liu, Ying Chen, Xin Deng, Mai Xu, Peipei Liang, Jie Ma, Junjie Jin, Yingxue Pang, Fangzhou Luo, Kai Chen, Shijie Zhao, Mingyang Wu, Renjie Li, Yushen Zuo, Zhengzhong Tu, Shengyun Zhong
PDF
NTIRE 2025 Challenge on Video Quality Enhancement for Video Conferencing: Datasets, Methods and Results Varun Jain, Zongwei Wu, Quan Zou, Louis Florentin, Henrik Turbell, Sandeep Siddhartha, Radu Timofte, Qifan Gao, Linyan Jiang, Qing Luo, Jie Song, Yaqing Li, Summer Luo, Mae Chen, Stefan Liu, Danie Song, Huimin Zeng, Qi Chen, Ajeet Kumar Verma, Shweta Tripathi, Vinit Jakhetiya, Badri N. Subudhi, Sunil Jaiswal
PDF
NTIRE 2025 Image Shadow Removal Challenge Report Florin-Alexandru Vasluianu, Tim Seizinger, Zhuyun Zhou, Cailian Chen, Zongwei Wu, Radu Timofte, Mingjia Li, Jin Hu, Hainuo Wang, Hengxing Liu, Jiarui Wang, Qiming Hu, Xiaojie Guo, Xin Lu, Jiarong Yang, Yuanfei Bao, Anya Hu, Zihao Fan, Kunyu Wang, Jie Xiao, Xi Wang, Xueyang Fu, Zheng-Jun Zha, Yu-Fan Lin, Chia-Ming Lee, Chih-Chung Hsu, Xingbo Wang, Dong Li, Yuxu Chen, Bin Chen, Yuanbo Zhou, Yuanbin Chen, Hongwei Wang, Jiannan Lin, Qinquan Gao, Tong Tong, Zhao Zhang, Yanyan Wei, Wei Dong, Han Zhou, Seyed Amirreza Mousavi, Jun Chen, Haobo Liang, Jiajie Jing, Junyu Li, Yan Yang, Seoyeon Lee, Chaewon Kim, Ziyu Feng, Shidi Chen, Bowen Luan, Zewen Chen, Vijayalaxmi Ashok Aralikatti, G. Gyaneshwar Rao, Nikhil Akalwadi, Chaitra Desai, Ramesh Ashok Tabib, Uma Mudenagudi, Anas M. Ali, Bilel Benjdira, Wadii Boulila, Alexandru Brateanu, Cosmin Ancuti, Tanmay Chaturvedi, Manish Kumar, Anmol Srivastav, Daksh Trivedi, Shashwat Thakur, Kishor P. Upla, Zeyu Xiao, Zhuoyuan Li, Boda Zhou, Shashank Shekhar, Kele Xu, Qisheng Xu, Zijian Gao, Tianjiao Wan, Suiyi Zhao, Bo Wang, Yan Luo, Mingshen Wang, Yilin Zhang
PDF
NTIRE 2025 the 2nd Restore Any Image Model (RAIM) in the Wild Challenge Jie Liang, Radu Timofte, Qiaosi Yi, Zhengqiang Zhang, Shuaizheng Liu, Lingchen Sun, Rongyuan Wu, Xindong Zhang, Hui Zeng, Lei Zhang, Tianyu Hao, Lin Wang, Zhe Xiao, Pengzhou Ji, Shupeng Zhong, Xiangming Wang, Jiaqi Yan, Sishun Pan, Ce Wang, Yibin Huang, Zhang Sheng Wang, Haobo Liang, Zhenghao Pan, Jinjian Wu, Yushen Zuo, Yuanbo Zhou
PDF
NTIRE 2025 XGC Quality Assessment Challenge: Methods and Results Xiaohong Liu, Xiongkuo Min, Qiang Hu, Xiaoyun Zhang, Jie Guo, Guangtao Zhai, Shushi Wang, Yingjie Zhou, Lu Liu, Jingxin Li, Liu Yang, Farong Wen, Li Xu, Yanwei Jiang, Xilei Zhu, Chunyi Li, Zicheng Zhang, Huiyu Duan, Xiele Wu, Yixuan Gao, Yuqin Cao, Jun Jia, Wei Sun, Jiezhang Cao, Radu Timofte, Baojun Li, Jiamian Huang, Dan Luo, Tao Liu, Weixia Zhang, Bingkun Zheng, Junlin Chen, Ruikai Zhou, Meiya Chen, Yu Wang, Hao Jiang, Xiantao Li, Yuxiang Jiang, Jun Tang, Yimeng Zhao, Bo Hu, Zelu Qi, Chaoyang Zhang, Fei Zhao, Ping Shi, Lingzhi Fu, Heng Cong, Shuai He, Rongyu Zhang, Jiarong He, Zongyao Hu, Wei Luo, Zihao Yu, Fengbin Guan, Yiting Lu, Xin Li, Zhibo Chen, Mengjing Su, Yi Wang, Tuo Chen, Chunxiao Li, Shuaiyu Zhao, Jiaxin Wen, Chuyi Lin, Sitong Liu, Ningxin Chu, Jing Wan, Yu Zhou, Baoying Chen, Jishen Zeng, Jiarui Liu, Xianjin Liu, Xin Chen, Lanzhi Zhou, Hangyu Li, You Han, Bibo Xiang, Zhenjie Liu, Jianzhang Lu, Jialin Gui, Renjie Lu, Shangfei Wang, Donghao Zhou, Jingyu Lin, Quanjian Song, Jiancheng Huang, Yufeng Yang, Changwei Wang, Shupeng Zhong, Yang Yang, Lihuo He, Jia Liu, Yuting Xing, Tida Fang, Yuchun Jin
PDF
OccludeNeRF: Geometry-Aware 3D Scene Inpainting with Collaborative Score Distillation in NeRF Jingyu Shi, Achleshwar Luthra, Jiazhi Li, Xiang Gao, Xiyun Song, Zongfang Lin, Xianfeng David Gu, Heather Yu
PDF
On the Robustness of GUI Grounding Models Against Image Attacks Haoren Zhao, Tianyi Chen, Zhen Wang
PDF
On the Suitability of Reinforcement Fine-Tuning to Visual Tasks Xiaxu Chen, Wei Li, Chunxu Liu, Chi Xie, Xiaoyan Hu, Chengqian Ma, Feng Zhu, Rui Zhao
PDF
Online Gaussian Test-Time Adaptation of Vision-Language Models Clément Fuchs, Maxime Zanella, Christophe De Vleeschouwer
PDF
OnlyFlow: Optical Flow Based Motion Conditioning for Video Diffusion Models Mathis Koroglu, Hugo Caselles-Dupré, Guillaume Jeanneret, Matthieu Cord
PDF
Open Dataset and Enhancement Method for Long-Wave Thermal Diurnal Material Classification Michael Pergeorelis, Tyler Rust, Chandra Kambhamettu
PDF
OpenSplat3D: Open-Vocabulary 3D Instance Segmentation Using Gaussian Splatting Jens Piekenbrinck, Christian Schmidt, Alexander Hermans, Narunas Vaskevicius, Timm Linder, Bastian Leibe
PDF
OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection Shuming Liu, Chen Zhao, Fatimah Zohra, Mattia Soldan, Alejandro Pardo, Mengmeng Xu, Lama Alssum, Merey Ramazanova, Juan León Alcázar, Anthony Cioppa, Silvio Giancola, Carlos Hinojosa, Bernard Ghanem
PDF
Optimising Vision Transformer Performance on Limited Datasets: A Multi-Gradient Approach Mohsin Ali, Haider Raza, John Q. Gan, Muhammad Haris
PDF
Out-of-Distribution Detection with Adversarial Outlier Exposure Thomas Botschen, Konstantin Kirchheim, Frank Ortmeier
PDF
Out-of-Distribution Segmentation in Autonomous Driving: Problems and State of the Art Youssef Shoeb, Azarm Nowzad, Hanno Gottschalk
PDF
Outlier-Robust Multi-Model Fitting on Quantum Annealers Saurabh Pandey, Luca Magri, Federica Arrigoni, Vladislav Golyanik
PDF
Overview of the 1st International Workshop on Interactive Video Search and Exploration Luca Rossetto, George Awad, Werner Bailer, Cathal Gurrin, Björn Þór Jónsson, Jakub Lokoc, Stevan Rudinac, Klaus Schoeffmann
PDF
PAN-RSVQA: Vision Foundation Models as Pseudo-ANnotators for Remote Sensing Visual Question Answering Christel Chappuis, Gencer Sümbül, Syrielle Montariol, Sylvain Lobry, Devis Tuia
PDF
PanoDreamer: Consistent Text to 360-Degree Scene Generation Zhexiao Xiong, Zhang Chen, Zhong Li, Yi Xu, Nathan Jacobs
PDF
Panopticon: Advancing Any-Sensor Foundation Models for Earth Observation Leonard Waldmann, Ando Shah, Yi Wang, Nils Lehmann, Adam J. Stewart, Zhitong Xiong, Xiao Xiang Zhu, Stefan Bauer, John Chuang
PDF
PartStickers: Generating Parts of Objects for Rapid Prototyping Mo Zhou, Josh Myers-Dean, Danna Gurari
PDF
PaSTe: Improving the Efficiency of Visual Anomaly Detection at the Edge Manuel Barusco, Francesco Borsatti, Davide Dalle Pezze, Francesco Paissan, Elisabetta Farella, Gian Antonio Susto
PDF
PatchContrast: Self-Supervised Pre-Training for 3D Object Detection Oren Shrout, Ori Nizan, Yizhak Ben-Shabat, Ayellet Tal
PDF
PCBEAR: Pose Concept Bottleneck for Explainable Action Recognition Jongseo Lee, Wooil Lee, Gyeong-Moon Park, Seong Tae Kim, Jinwoo Choi
PDF
Perturbed State Space Feature Encoders for Optical Flow with Event Cameras Gokul Raju Govinda Raju, Nikola Zubic, Marco Cannici, Davide Scaramuzza
PDF
PETAH: Parameter Efficient Task Adaptation for Hybrid Transformers Maximilian Augustin, Syed Shakib Sarwar, Mostafa Elhoushi, Yuecheng Li, Sai Qian Zhang, Barbara De Salvo
PDF
PF3Det: A Prompted Foundation Feature Assisted Visual LiDAR 3D Detector Kaidong Li, Tianxiao Zhang, Kuan-Chuan Peng, Guanghui Wang
PDF
Physics-Based Human Pose Estimation from a Single Moving RGB Camera Ayce Idil Aytekin, Chuqiao Li, Diogo C. Luvizon, Rishabh Dabral, Martin R. Oswald, Marc Habermann, Christian Theobalt
PDF
PhysNav-DG: A Novel Adaptive Framework for Robust VLM-Sensor Fusion in Navigation Applications Trisanth Srinivasan, Santosh V. Patapati
PDF
PhytoSynth: Leveraging Multi-Modal Generative Model for Crop Disease Data Generation with Novel Benchmarking and Prompt Engineering Approach Nitin Rai, Arnold W. Schumann, Nathan Boyd
PDF
PiCaZo: Pixel-Aligned Contrastive Learning for Zero-Shot Domain Adaptation Aniruddh Sikdar, Arya Kishor, Ishika Kadam, Suresh Sundaram
PDF
PineSORT: A Simple Online Real-Time Tracking Framework for Drone Videos in Agriculture Danny Xie-Li, Fabian Fallas-Moya
PDF
PluckeRF: A Line-Based 3D Representation for Few-View Reconstruction Sam Bahrami, Dylan Campbell
PDF
PLVM: A Tuning-Free Approach for Personalized Large Vision-Language Model Chau Pham, Hoang Phan, David S. Doermann, Yunjie Tian
PDF
Polar Coordinate-Based 2D Pose Prior with Neural Distance Field Qi Gan, Sao Mai Nguyen, Eric Fenaux, Stéphan Clémençon, Mounim A. El-Yacoubi
PDF
Pose-Aware Weakly-Supervised Action Segmentation Zhihao Zhao, Reza Ghoddoosian, Isht Dwivedi, Nakul Agarwal, Behzad Dariush
PDF
Pose-to-Pose: A New Task and Benchmark for Human Pose Transition in Yoga Bhat Dittakavi, Swarnim Maheshwari, Vineeth N. Balasubramanian
PDF
PoseGuru: Landmarks for Explainable Pose Correction Using Exemplar-Guided Algorithmic Recourse Bhat Dittakavi, Bharathi Callepalli, Swarnim Maheshwari, Vineeth Balasubramanian
PDF
PoseSynViT: Lightweight and Scalable Vision Transformers for Human Pose Estimation Sonain Jamil
PDF
PPTracker: Tracking UAV Swarms with Prior Prompt Haolin Qin, Tianhao Li, Tingfa Xu, Jingxuan Xu, Yuqiang Fang, Jianan Li
PDF
Predicting Butterfly Species Presence from Satellite Imagery Using Soft Contrastive Regularisation Thijs L. van der Plas, Stephen Law, Michael JO Pocock
PDF
PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario Sriram Mandalika, Lalitha V, Athira Nambiar
PDF
Privacy Preserving Ordinal-Meta Learning with VLMs for Fine-Grained Fruit Quality Prediction Riddhi Jain, Manasi Patwardhan, Aayush Mishra, Parijat Deshpande, Beena Rai
PDF
Probabilistic Online Event Downsampling Andreu Girbau-Xalabarder, Jun Nagata, Shinichi Sumiyoshi
PDF
Probabilistic Perspective-N-Lines for Indoor Camera Pose Estimation Xiaowei Chen, Guoliang Fan
PDF
Probing Vulnerabilities of Vision-LiDAR Based Autonomous Driving Systems Siwei Yang, Zeyu Wang, Diego Ortiz Barbosa, Luis Burbano, Murat Kantarcioglu, Alvaro A. Cárdenas, Cihang Xie
PDF
Proc-GS: Procedural Building Generation for City Assembly with 3D Gaussians Yixuan Li, Xingjian Ran, Linning Xu, Tao Lu, Mulin Yu, Zhenzhi Wang, Yuanbo Xiangli, Dahua Lin, Bo Dai
PDF
Progressive Autoregressive Video Diffusion Models Desai Xie, Zhan Xu, Yicong Hong, Hao Tan, Difan Liu, Feng Liu, Arie E. Kaufman, Yang Zhou
PDF
Prompt Categories Cluster for Weakly Supervised Semantic Segmentation Wangyu Wu, Xianglin Qiu, Siqi Song, Zhenhong Chen, Xiaowei Huang, Fei Ma, Jimin Xiao
PDF
Prompt the Missing: Prompt-Based Robust Audio-Visual Classification Under Uncertain Modalities Eunju Park
PDF
Prompt-Guided Attention Head Selection for Focus-Oriented Image Retrieval Yuji Nozawa, Yu-Chieh Lin, Kazumoto Nakamura, Youyang Ng
PDF
Prompt-Tuning SAM: From Generalist to Specialist with Only 2048 Parameters and 16 Training Images Tristan Piater, Björn Barz, Alexander Freytag
PDF
PromptNorm: Image Geometry Guides Ambient Light Normalization David Serrano-Lozano, Francisco A. Molina-Bakhos, Danna Xue, Yixiong Yang, Maria Pilligua, Ramon Baldrich, María Vanrell, Javier Vazquez-Corral
PDF
ProtoPatchNet: An Interpretable Patch-Based Prototypical Network Mohana Singh, Vivek B. S., Jayavardhana Gubbi, R. Venkatesh Babu
PDF
Prototype-Based Continual Learning with Label-Free Replay Buffer and Cluster Preservation Loss Agil Aghasanli, Yi Li, Plamen Angelov
PDF
Prototype-Guided Diffusion for Digital Pathology: Achieving Foundation Model Performance with Minimal Clinical Data Ekaterina Redekop, Mara Pleasure, Vedrana Ivezic, Zichen Wang, Kimberly Flores, Anthony Sisk, William Speier, Corey W. Arnold
PDF
PS4PRO: Pixel-to-Pixel Supervision for Photorealistic Rendering and Optimization Yezhi Shen, Qiuchen Zhai, Fengqing Zhu
PDF
Pseudo-Labelling Meets Label Smoothing for Noisy Partial Label Learning Darshana Saravanan, Naresh Manwani, Vineet Gandhi
PDF
Pureformer: Transformer-Based Image Denoising Arnim Gautam, Aditi Pawar, Aishwarya Joshi, Satya Narayan Tazi, Sachin Chaudhary, Praful Hambarde, Akshay Dudhane, Santosh Kumar Vipparthi, Subrahmanyam Murala
PDF
PVUW 2025 Challenge Report: Advances in Pixel-Level Understanding of Complex Videos in the Wild Henghui Ding, Chang Liu, Nikhila Ravi, Shuting He, Yunchao Wei, Song Bai, Philip Torr
PDF
Q-CIDNet: Perceptual Quality Aware Color and Intensity Decoupling Network for Video Quality Enhancement Ajeet Kumar Verma, Shweta Tripathi, Vinit Jakhetiya, Badri N. Subudhi, Sunil Jaiswal
PDF
QID: Efficient Query-Informed ViTs in Data-Scarce Regimes for OCR-Free Document Understanding Binh M. Le, Shaoyuan Xu, Jinmiao Fu, Zhishen Huang, Moyan Li, Yanhui Guo, Hongdong Li, Sameera Ramasinghe, Bryan Wang
PDF
Quadrocular, Neuromorphic Stereo Triangulation and Asynchronous Data Fusion for 3D Object Tracking Jonah Sengupta
PDF
Quality Assessment for Talking Head Videos via Multi-Modal Feature Representation Mengjing Su, Yi Wang, Tuo Chen, Chunxiao Li, Shuaiyu Zhao, Jiaxin Wen, Chuyi Lin, Sitong Liu, Ningxin Chu, Yu Zhou
PDF
Quantized Image Super-Resolution on Mobile NPUs, Mobile AI 2025 Challenge: Report Andrey Ignatov, Georgy Perevozchikov, Radu Timofte, Zhiyu Zhang, Tianxiao Gao, Yukun Yang, Shiai Zhu, Shihao Wang, Kihwan Yoon, Ganzorig Gankhuyag, Hyeon-Cheol Moon, Taehyun Jeong, Yumi Kim, Suhyeon Lee, Jaehun Baek, Jinwoo Jeong, Eunjun Park, Jun Lee, Heejun Lee, Sungjei Kim, Dafeng Zhang, Yong Yang, Heo Myeong Cheol, Yonghyun Park, Jooho Jeong, Wontae Kim, Kanghwan Lee, Diankai Zhang, Biao Wu, Chengjian Zheng, Shaoli Liu, Si Gao, Ning Wang, Mingshen Wang, Zhao Zhang, Suiyi Zhao, Jinhan Guan, Bo Wang, Yan Luo
PDF
Quantum Federated Learning for Multimodal Data: A Modality-Agnostic Approach Atit Pokharel, Ratun Rahman, Thomas Morris, Dinh C. Nguyen
PDF
RAD: Retrieval-Augmented Decision-Making of Meta-Actions with Vision-Language Models in Autonomous Driving Yujin Wang, Quanfeng Liu, Zhengxin Jiang, Tianyi Wang, Junfeng Jiao, Hongqing Chu, Bingzhao Gao, Hong Chen
PDF
RADLER: Radar Object Detection Leveraging Semantic 3D City Models and Self-Supervised Radar-Image Learning Yuan Luo, Rudolf Hoffmann, Yan Xia, Olaf Wysocki, Benedikt Schwab, Thomas H. Kolbe, Daniel Cremers
PDF
RAW Image Reconstruction from RGB on Smartphones. NTIRE 2025 Challenge Report Marcos V. Conde, Radu Timofte, Radu Berdan, Beril Besbinar, Daisuke Iso
PDF
Read My Ears! Horse Ear Movement Detection for Equine Affective State Assessment João Alves, Pia Haubro Andersen, Rikke Gade
PDF
Reading in the Dark with Foveated Event Vision Carl Brander, Giovanni Cioffi, Nico Messikommer, Davide Scaramuzza
PDF
Real-Time Pedestrian Detection at the Edge on a Fully Asynchronous Neuromorphic System Hugo Bulzomi, Alimatou Sadia Memudu, Yuta Nakano, Jean Martinet
PDF
Real-Time Ultra-Fine-Grained Surgical Instrument Classification Md. Atabuzzaman, Gino DiMatteo, Hani Alomari, Chiawei Tang, Connor Hale, Adam E. Goode, David Ryan King, Chris Thomas
PDF
ReasonDrive: Efficient Visual Question Answering for Autonomous Vehicles with Reasoning-Enhanced Small Vision-Language Models Amirhosein Chahe, Lifeng Zhou
PDF
Recursive Multi-Exposure Alignment with Spatiotemporal Decoupling for Efficient Burst HDR and Restoration Tianheng Qiu, Qi Wu, Yuchun Dong, Shenglin Ding, Xuan Huang, Hu Wei, Guanghua Pan
PDF
REEF: Relevance-Aware and Efficient LLM Adapter for Video Understanding Sakib Reza, Xiyun Song, Heather Yu, Zongfang Lin, Mohsen Moghaddam, Octavia I. Camps
PDF
ReferGPT: Towards Zero-Shot Referring Multi-Object Tracking Tzoulio Chamiti, Leandro Di Bella, Adrian Munteanu, Nikos Deligiannis
PDF
REJEPA: A Novel Joint-Embedding Predictive Architecture for Efficient Remote Sensing Image Retrieval Shabnam Choudhury, Yash Salunkhe, Sarthak Mehrotra, Biplab Banerjee
PDF
Rel-SA: Alzheimer's Disease Detection Using Relevance-Augmented Self Attention by Inducing Domain Priors in Vision Transformers Madhumitha V, Sunayna Padhye, Shanawaj S. Madarkar, Susmit Agrawal, Konda Reddy Mopuri
PDF
RepFC: Universal Structural Reparametrization Block for High Performance, Lightweight Deep Neural Networks Shambhavi Balamuthu Sampath, Judeson Anthony Fernando, Moritz Thoma, Nael Fasfous, Lukas Frickenstein, Pierpaolo Morì, Manoj Rohit Vemparala, Alexander Frickenstein, Ulf Schlichtmann, Walter Stechele
PDF
Repurposing SAM for User-Defined Semantics Aware Segmentation Rohit Kundu, Sudipta Paul, Arindam Dutta, Amit Roy-Chowdhury
PDF
Rethinking Compressive Sensing: A Compression Framework for Video Super-Resolution Ruthy Katz, Adi Teitel, Moran Mordechay, Adi Falik, Eli Bery, Maya Mayberg
PDF
Rethinking the Role of Spatial Mixing George Cazenavette, Joel Julin, Simon Lucey
PDF
Retinex-Guided Histogram Transformer for Mask-Free Shadow Removal Wei Dong, Han Zhou, Seyed Amirreza Mousavi, Jun Chen
PDF
Revisiting Multi-Modal LLM Evaluation Jian Lu, Shikhar Srivastava, Junyu Chen, Robik Shrestha, Manoj Acharya, Kushal Kafle, Christopher Kanan
PDF
Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models Jierun Chen, Fangyun Wei, Jinjing Zhao, Sizhe Song, Bohuai Wu, Zhuoxuan Peng, S.-H. Gary Chan, Hongyang Zhang
PDF
Revolutionizing Drug Discovery: Integrating Spatial Transcriptomics with Advanced Computer Vision Techniques Zichao Li, Shiqing Qiu, Zong Ke
PDF
RGB Photo Enhancement on Mobile GPUs, Mobile AI 2025 Challenge: Report Andrey Ignatov, Georgy Perevozchikov, Radu Timofte, Wu Pan, Song Wang, Dong Zhang, Zhao Ran, Xiaochen Li, Shichang Ju, Diankai Zhang, Biao Wu, Shaoli Liu, Si Gao, Chengjian Zheng, Ning Wang, Yi Feng, Cailu Wan, Xiangji Wu, Hailong Yan, Ao Li, Xiangtao Zhang, Zhe Liu, Ce Zhu, Le Zhang, Jinjie Zhou, Yang Lu, Feng Duo, Runhua Deng, Xuanyu Chen, Shuhui Xie, Guojie Xiao, Zhifeng Wang, Long Peng, Aiwen Jiang
PDF
Robust 6DoF Pose Estimation Against Depth Noise and a Comprehensive Evaluation on a Mobile Dataset Zixun Huang, Keling Yao, Zhihao Zhao, Chuanyu Pan, Allen Y. Yang
PDF
Robust AD: A Real World Benchmark Dataset for Robustness in Industrial Anomaly Detection Latha Pemula, Dongqing Zhang, Onkar Dabeer
PDF
Robust Stage-Wise LVLM Adaptation: Multi-Phase Prompt LoRA Fine-Tuning for Compound Expression Recognition Xilong Lu, Jun Yu, Yunxiang Zhang, Lingsi Zhu, Yang Zheng, Yongqi Wang, Qiang Ling
PDF
Robustness Evaluation for Video Models with Reinforcement Learning Ashwin Ramesh Babu, Sajad Mousavi, Vineet Gundecha, Sahand Ghorbanpour, Avisek Naug, Antonio Guillen, Ricardo Luna, Soumyendu Sarkar
PDF
Robusto-1 Dataset: Comparing Humans and VLMs on Real Out-of-Distribution Autonomous Driving VQA from Peru Dunant Cusipuma, David Ortega, Victor Flores-Benites, Arturo Deza
PDF
S-Band SAR Target Classification via 2D and 3D Deep Learning Methods Tyler Rust, Michael Pergeorelis, Chandra Kambhamettu, Colin Kelly
PDF
S-EO: A Large-Scale Dataset for Geometry-Aware Shadow Detection in Remote Sensing Applications Elías Masquil, Roger Marí, Thibaud Ehret, Enric Meinhardt-Llopis, Pablo Musé, Gabriele Facciolo
PDF
S2p-Hd: GPU-Accelerated Binocular Stereo Pipeline for Large-Scale Same-Date Stereo Tristan Amadei, Enric Meinhardt-Llopis, Carlo de Franchis, Jérémy Anger, Thibaud Ehret, Gabriele Facciolo
PDF
Safe-Construct: Redefining Construction Safety Violation Recognition as 3D Multi-View Engagement Task Aviral Chharia, Tianyu Ren, Tomotake Furuhata, Kenji Shimada
PDF
SAGA: Semantic-Aware Gray Color Augmentation for Visible-to-Thermal Domain Adaptation Across Multi-View Drone and Ground-Based Vision Systems Manjunath D, Aniruddh Sikdar, Prajwal Gurunath, Sumanth Udupa, Suresh Sundaram
PDF
Salient Object Detection with Dynamic Convolutions Rohit Venkata Sai Dulam, Chandra Kambhamettu
PDF
SAM4EM: Efficient Memory-Based Two Stage Prompt-Free Segment Anything Model Adapter for Complex 3D Neuroscience Electron Microscopy Stacks Uzair Shah, Marco Agus, Daniya Boges, Vanessa Chiappini, Mahmood Alzubaidi, Jens Schneider, Markus Hadwiger, Pierre J. Magistretti, Mowafa S. Househ, Corrado Calì
PDF
SAMJAM: Zero-Shot Video Scene Graph Generation for Egocentric Kitchen Videos Joshua Li, Fernando Jose Pena Cantu, Emily Yu, Alexander Wong, Yuchen Cui, Yuhao Chen
PDF
SARFormer - An Acquisition Parameter Aware Vision Transformer for Synthetic Aperture Radar Data Jonathan Prexl, Michael Recla, Michael Schmitt
PDF
SC-NeRF: NeRF-Based Point Cloud Reconstruction Using a Stationary Camera for Agricultural Applications Kibon Ku, Talukder Z. Jubery, Elijah Rodriguez, Aditya Balu, Soumik Sarkar, Adarsh Krishnamurthy, Baskar Ganapathysubramanian
PDF
Scale-Invariant Implicit Neural Representations for Object Counting Siyuan Xu, Yucheng Wang, Xihaier Luo, Byung-Jun Yoon, Xiaoning Qian
PDF
Scaling Laws in Zero-Shot Gender Classification Using CLIP Lucas M. Ceschini, Gabriel de Oliveira Ramos, Cláudio R. Jung
PDF
Scaling On-Device GPU Inference for Large Generative Models Jiuqiang Tang, Raman Sorokin, Ekaterina Ignasheva, Grant Jensen, Lin Chen, Juhyun Lee, Andrei Kulik, Matthias Grundmann
PDF
Scene-Specific Anomalous Relationship Detection Using Scene Graph Summarization Yu-Chen Lai, Motoharu Sonogashira, Itthisak Phueaksri, Yasutomo Kawanishi
PDF
ScoreCAM++: Gated Score-Weighted Visual Explanations for CNNs Soham Mitra, Atri Sukul, Swalpa Kumar Roy, Pravendra Singh, Vinay Kumar Verma
PDF
Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions Yifei Dong, Fengyi Wu, Sanjian Zhang, Guangyu Chen, Yuzhi Hu, Masumi Yano, Jingdong Sun, Siyu Huang, Feng Liu, Qi Dai, Zhi-Qi Cheng
PDF
Seeing like a Cephalopod: Colour Vision with a Monochrome Event Camera Sami Arja, Nimrod Kruger, Alexandre Marcireau, Nicholas Owen Ralph, Saeed Afshar, Gregory Cohen
PDF
Segment Any Primitive: Zero-Shot 3D Primitive Segmentation from Point Cloud Yushan Bai, Shaohu Wang, Rongtao Xu, Yuchuang Tong, Chaoran Xu, Zhengtao Zhang
PDF
Segment AnyNeuron Taha Razzaq, Ahmed Rashid Qazi, Asim Iqbal
PDF
Selective Test-Time Domain Adaptation Using Fisher Information for Robust Facial Expression Recognition In-the-Wild Mohammadmahdi Honarmand, Onur Cezmi Mutlu, Parnian Azizian, Saimourya Surabhi, Dennis P. Wall
PDF
Self-Supervised Pretraining for Fine-Grained Plankton Recognition Joona Kareinen, Tuomas Eerola, Kaisa Kraft, Lasse Lensu, Sanna Suikkanen, Heikki Kälviäinen
PDF
Semantic Matters: Multimodal Features for Affective Analysis Tobias Hallmen, Robin-Nico Kampa, Fabian Deuser, Norbert Oswald, Elisabeth André
PDF
Semantic-Aware Local Image Editing with a Single Mask Operation Dongchao Wen, Zijian Chen, Weihong Deng, Yujiang Tian, Hongzhi Shi, Yingjie Zhang, Xingchen Cui, Jian Zhao, Lingyan Liang, Mei Wang
PDF
SemanticSugarBeets: A Multi-Task Framework and Dataset for Inspecting Harvest and Storage Characteristics of Sugar Beets Gerardus Croonen, Andreas Trondl, Julia Simon, Daniel Steininger
PDF
Semi-Supervised Object-Wise Anomaly Detection for Firearm and Firearm Component Detection in X-Ray Security Imagery Yona Falinie A. Gaus, Brian K. S. Isaac-Medina, Neelanjan Bhowmik, Yam T. Lee, Toby P. Breckon
PDF
Separating Shared and Domain-Specific LoRAs for Multi-Domain Learning Yusaku Takama, Ning Ding, Tatsuya Yokota, Toru Tamaki
PDF
Shopformer: Transformer-Based Framework for Detecting Shoplifting via Human Pose Narges Rashvand, Ghazal Alinezhad Noghre, Armin Danesh Pazho, Babak Rahimi Ardabili, Hamed Tabkhi
PDF
Short-Term 3D Human Mesh Recovery with Virtual Markers Disentanglement Xiyuan Kang, Yi Yuan, Xu Dong, Muhammad Awais, Lilian Tang, Josef Kittler, Zhenhua Feng
PDF
Show or Tell? A Benchmark to Evaluate Visual and Textual Prompts in Semantic Segmentation Gabriele Rosi, Fabio Cermelli
PDF
SILK: Smooth InterpoLation frameworK for Motion In-Betweening Elly Akhoundi, Hung Yu Ling, Anup Anand Deshmukh, Judith Bütepage
PDF
SilVar-Med: A Speech-Driven Visual Language Model for Explainable Abnormality Detection in Medical Imaging Tan-Hanh Pham, Trong-Duong Bui, Quang Minh Luu, Tan-Huong Pham, Chris Ngo, Truong-Son Hy
PDF
SimCache: Similarity Caching for Efficient VLM-Based Scene Understanding Surya Selvam, Ravi K. Rajendran, Murugan Sankaradas, Anand Raghunathan, Srimat T. Chakradhar
PDF
Single-Stage Uncertainty-Aware Jersey Number Recognition in Soccer Lukasz Grad
PDF
SK-RD4AD: Skip-Connected Reverse Distillation for Robust One-Class Anomaly Detection Eun-Ju Park, Taekyung Kim, Minju Kim, Hojun Lee, Gil-Jun Lee
PDF
Skin Lesion Classification Using Dermoscopic Images and Clinical Metadata: Insights from Multimodal Models Sakib Ahammed, Xia Cui, Wenqi Lu, Moi Hoon Yap
PDF
Skor-xG: SKeleton-ORiented Expected Goal Estimation in Soccer Yizhou Xu, Lars Bretzner, Tiesheng Wang, Atsuto Maki
PDF
Slot Attention-Based Feature Filtering for Few-Shot Learning Javier Ródenas Cumplido, Eduardo Aguilar, Petia Radeva
PDF
SLRTP2025 Sign Language Production Challenge: Methodology, Results and Future Work Harry Walsh, Edward Fish, Ozge Mercanoglu Sincan, Mohamed Ilyes Lakhal, Richard Bowden, Neil Fox, Bencie Woll, Kepeng Wu, Zecheng Li, Weichao Zhao, Haodong Wang, Wengang Zhou, Houqiang Li, Shengeng Tang, Jiayi He, Xu Wang, Ruobei Zhang, Yaxiong Wang, Lechao Cheng, Sümeyye Meryem Tasyürek, Tugçe Kiziltepe, Hacer Yalim Keles
PDF
SmallGS: Gaussian Splatting-Based Camera Pose Estimation for Small-Baseline Videos Yuxin Yao, Yan Zhang, Zhening Huang, Joan Lasenby
PDF
SmartHome-Bench: A Comprehensive Benchmark for Video Anomaly Detection in Smart Homes Using Multi-Modal Large Language Models Xinyi Zhao, Congjing Zhang, Pei Guo, Wei Li, Lin Chen, Chaoyue Zhao, Shuai Huang
PDF
SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers Joseph Liu, Joshua Geddes, Ziyu Guo, Haomiao Jiang, Mahesh Kumar Nandwana
PDF
SoccerNet-v3D: Leveraging Sports Broadcast Replays for 3D Scene Understanding Marc Gutiérrez-Pérez, Antonio Agudo
PDF
SoyStageNet: Balancing Accuracy and Efficiency for Real-Time Soybean Growth Stage Detection Abdellah Lakhssassi, Toqi Tahamid Sarker, Khaled R. Ahmed, Naoufal Lakhssassi, Khalid Meksem
PDF
Spatio-Temporal State Space Model for Efficient Event-Based Optical Flow Muhammad Ahmed Humais, Xiaoqian Huang, Hussain M. Sajwani, Sajid Javed, Yahya H. Zweiri
PDF
SPIdepth: Strengthened Pose Information for Self-Supervised Monocular Depth Estimation Mykola Lavreniuk, Alla Lavreniuk
PDF
Splat-SLAM: Globally Optimized RGB-Only SLAM with 3D Gaussians Erik Sandström, Ganlin Zhang, Keisuke Tateno, Michael Oechsle, Michael Niemeyer, Youmin Zhang, Manthan Patel, Luc Van Gool, Martin R. Oswald, Federico Tombari
PDF
SplatMesh: Interactive 3D Segmentation and Editing Using Mesh-Based Gaussian Splatting Kaichen Zhou, Lanqing Hong, Xinhai Chang, Yingji Zhong, Enze Xie, Hao Dong, Zhihao Li, Yongxin Yang, Zhenguo Li, Wei Zhang
PDF
SplatTouch: Explicit 3D Representation Binding Vision and Touch Antonio Luigi Stefani, Niccolò Bisagno, Nicola Conci, Francesco G. B. De Natale
PDF
Sporadic Federated Learning Approach in Quantum Environment to Tackle Quantum Noise Ratun Rahman, Atit Pokharel, Dinh C. Nguyen
PDF
Sport Field Calibration with NeRF-Guided Camera Optimization from a Single Image Liang Fan, Xiaoqian Liu, Malcolm Roberts
PDF
SportMamba: Adaptive Non-Linear Multi-Object Tracking with State Space Models for Team Sports Dheeraj Khanna, Jerrin Bright, Yuhao Chen, John S. Zelek
PDF
SRVP: Strong Recollection Video Prediction Model Using Attention-Based Spatiotemporal Correlation Fusion Yuseon Kim, Kyongseok Park
PDF
SSL4Eco: A Global Seasonal Dataset for Geospatial Foundation Models in Ecology Elena Plekhanova, Damien Robert, Johannes Dollinger, Emilia Arens, Philipp Brun, Jan Dirk Wegner, Niklaus E. Zimmermann
PDF
STAM: Zero-Shot Style Transfer Using Diffusion Model via Attention Modulation Masud An Nur Islam Fahim, Nazmus Saqib, Jani Boutellier
PDF
STAPLE: Siamese Transformer Assisted Pseudo Label Ensembling for Unsupervised Domain Adaptation in No-Reference IQA Arshita Gupta, Zhe Zhu, Tien Bau
PDF
Stochastic-Based Patch Filtering for Few-Shot Learning Javier Ródenas Cumplido, Eduardo Aguilar, Petia Radeva
PDF
Strong Baseline: Multi-UAV Tracking via YOLOv12 with BoT-SORT-ReID Yu-Hsi Chen
PDF
StrongSiamTracker: A Siamese Tracker with Dynamic Global Detection for Robust Anti-UAV Tracking Xiaolong Cui, Liu Wan, Lingqi Kong, Jimin Li, Chaojie Zhang, Ruohan Zhao, Panlong Wu, Shan He
PDF
STRRNet: Semantics-Guided Two-Stage Raindrop Removal Network Qiyu Rong, Hongyuan Jing, Mengmeng Zhang, Jinlong Li, Mengfei Han
PDF
Studying Image Diffusion Features for Zero-Shot Video Object Segmentation Thanos Delatolas, Vicky Kalogeiton, Dim P. Papadopoulos
PDF
SVAD: From Single Image to 3D Avatar via Synthetic Data Generation with Video Diffusion and Data Augmentation Yonwoo Choi
PDF
SwarmDiff: Swarm Robotic Trajectory Planning in Cluttered Environments via Diffusion Transformer Kang Ding, Chunxuan Jiao, Yunze Hu, Kangjie Zhou, Pengying Wu, Yao Mu, Chang Liu
PDF
SwinPaste: A Swin Transformer-Based Framework for RGB-Guided Thermal Image Super-Resolution Hang Zhong, Yu Wang, Shengjie Zhao
PDF
Syn3DTxt: Embedding 3D Cues for Scene Text Generation Li-Syun Hsiung, Jun-Kai Tu, Kuan-Wu Chu, Yu-Hsuan Chiu, Yan-Tsung Peng, Sheng-Luen Chung, Gee-Sern Hsu
PDF
Synthetic Data Augmentation Using Pre-Trained Diffusion Models for Long-Tailed Food Image Classification GaYeon Koh, Hyun-Jic Oh, Jeonghyun Noh, Won-Ki Jeong
PDF
T-SAM: Transductive Learning for Segment Anything Model Rangel Daroya, Deepak Chandran, Subhransu Maji, Andrea Fanelli
PDF
Talk2Traffic: Interactive and Editable Traffic Scenario Generation for Autonomous Driving with Multimodal Large Language Model Zihao Sheng, Zilin Huang, Yansong Qu, Yue Leng, Sikai Chen
PDF
Task-Agnostic Attacks Against Vision Foundation Models Brian Pulfer, Yury Belousov, Vitaliy Kinakh, Teddy Furon, Slava Voloshynovskiy
PDF
Task-Conditioned Ensemble of Expert Models for Continuous Learning Renu Sharma, Debasmita Pal, Arun Ross
PDF
Task-Informed Meta-Learning for Remote Sensing Gabriel Tseng, Hannah Kerner, David Rolnick
PDF
Task-Level Contrastiveness for Cross-Domain Few-Shot Learning Kristi Topollai, Anna Choromanska
PDF
TB-Bench: Training and Testing Multi-Modal AI for Understanding Spatio-Temporal Traffic Behaviors from Dashcam Images/Videos Korawat Charoenpitaks, Van-Quang Nguyen, Masanori Suganuma, Kentaro Arai, Seiji Totsuka, Hiroshi Ino, Takayuki Okatani
PDF
Temporal Consistent Semantic Video Color Transfer from Multiple References Aupendu Kar, Guan-Ming Su
PDF
TerraMesh: A Planetary Mosaic of Multimodal Earth Observation Data Benedikt Blumenstiel, Paolo Fraccaro, Valerio Marsocci, Johannes Jakubik, Stefano Maurogiovanni, Mikolaj Czerkawski, Rocco Sedona, Gabriele Cavallaro, Thomas Brunschwiler, Juan Bernabé-Moreno, Nicolas Longépé
PDF
Text-Guided Patch Scoring and Local Distortion Guidance for Image Quality Assessment Juyong Park, Jihun Song, Gyewan Kim, Yoonsuk Hyun
PDF
TextInVision: Text and Prompt Complexity Driven Visual Text Generation Benchmark Forouzan Fallah, Maitreya Patel, Agneet Chatterjee, Vlad I. Morariu, Chitta Baral, Yezhou Yang
PDF
Texture2LoD3: Enabling LoD3 Building Reconstruction with Panoramic Images Wenzhao Tang, Weihang Li, Xiucheng Liang, Olaf Wysocki, Filip Biljecki, Christoph Holst, Boris Jutzi
PDF
The Fourth Monocular Depth Estimation Challenge Anton Obukhov, Matteo Poggi, Fabio Tosi, Ripudaman Singh Arora, Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden, Shuaihang Wang, Zhenxin Ma, Weijie Chen, Baobei Xu, Fengyu Sun, Di Xie, Jiang Zhu, Mykola Lavreniuk, Haining Guan, Qun Wu, Yupei Zeng, Chao Lu, Huanran Wang, GuangYuan Zhou, Haotian Zhang, Jianxiong Wang, Qiang Rao, Chunjie Wang, Xiao Liu, Zhiqiang Lou, Hualie Jiang, Yihao Chen, Rui Xu, Minglang Tan, Zihan Qin, Yifan Mao, Jiayang Liu, Jialei Xu, Yifan Yang, Wenbo Zhao, Junjun Jiang, Xianming Liu, Mingshuai Zhao, Anlong Ming, Wu Chen, Feng Xue, Mengying Yu, Shida Gao, Xiangfeng Wang, Gbenga Omotara, Ramy Farag, Jacket Demby's, Seyed Mohamad Ali Tousi, Guilherme N. DeSouza, Tuan-Anh Yang, Minh-Quang Nguyen, Thien-Phuc Tran, Albert Luginov, Muhammad Shahzad
PDF
The Power of Augmentations in IR Object Detection Ihsan Emre Üstün, Cevahir Çigla
PDF
The Surprising Utility of Group Partitioning in Improving Conformal Prediction of Visual Classifiers Under Distributional Shifts Kowshik Thopalli, Vivek Sivaraman Narayanaswamy, Jayaraman J. Thiagarajan
PDF
The Tenth NTIRE 2025 Efficient Super-Resolution Challenge Report Bin Ren, Hang Guo, Lei Sun, Zongwei Wu, Radu Timofte, Yawei Li
PDF
The Tenth NTIRE 2025 Image Denoising Challenge Report Lei Sun, Hang Guo, Bin Ren, Luc Van Gool, Radu Timofte, Yawei Li
PDF
The Way Up: A Dataset for Hold Usage Detection in Sport Climbing Anna Maschek, David C. Schedl
PDF
Thermal Image Super-Resolution Challenge Results - PBVS 2025 Rafael E. Rivadeneira, Ángel D. Sappa, Riad I. Hammoud, Jiyong Rao, Hang Zhong, Yu Wang, Shengjie Zhao, Zhiwei Zhong, Yung-Hui Li, Shiqi Wang, Qiangqiang Shen, Hanzhang Wang, Xuanqi Zhang
PDF
Thermal Pedestrian Multiple Object Tracking Challenge (TP-MOT) Wassim A. El Ahmar, Ángel D. Sappa, Riad I. Hammoud
PDF
TLAC: Two-Stage LMM Augmented CLIP for Zero-Shot Classification Ans Munir, Faisal Z. Qureshi, Muhammad Haris Khan, Mohsen Ali
PDF
To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition Davide Sferrazza, Gabriele Moreno Berton, Gabriele Trivigno, Carlo Masone
PDF
ToF-360 - A Panoramic Time-of-Flight RGB-D Dataset for Single Capture Indoor Semantic 3D Reconstruction Hideaki Kanayama, Mahdi Chamseddine, Suresh Guttikonda, So Okumura, Soichiro Yokota, Didier Stricker, Jason R. Rambach
PDF
TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs Zijian Zhang, Xuhui Zheng, Xuecheng Wu, Chong Peng, Xuezhi Cao
PDF
Toward Automation in Text-Based Video Retrieval with LLM Assistance Khanh-An C. Quan, Qui Ngoc Nguyen, Duc-Tuan Luu
PDF
Towards Ball Spin and Trajectory Analysis in Table Tennis Broadcast Videos via Physically Grounded Synthetic-to-Real Transfer Daniel Kienzle, Robin Schön, Rainer Lienhart, Shin'ichi Satoh
PDF
Towards Efficient and Robust Moment Retrieval System: A Unified Framework for Multi-Granularity Models and Temporal Reranking Huu-Loc Tran, Tinh-Anh Nguyen-Nhu, Huu-Phong Phan-Nguyen, Tien-Huy Nguyen, Nhat-Minh Nguyen-Dich, Anh Dao, Huy-Duc Do, Quan Nguyen, Hoang M. Le, Quang-Vinh Dinh
PDF
Towards Efficient Benchmarking of Foundation Models in Remote Sensing: A Capabilities Encoding Approach Pierre Adorni, Minh-Tan Pham, Stéphane May, Sébastien Lefèvre
PDF
Towards Evaluating the Robustness of Visual State Space Models Hashmat Shadab Malik, Fahad Shamshad, Muzammal Naseer, Karthik Nandakumar, Fahad Shahbaz Khan, Salman Khan
PDF
Towards Exploring Continual Learning for Toxicologic Pathology in Pharmaceutical Drug Discovery Arijit Patra, Jinge Wu, Honghan Wu, Anshul Thakur
PDF
Towards Faster and More Compact Foundation Models for Molecular Property Prediction Yasir Ghunaim, Andrés Villa, Gergo Ignacz, Gyorgy Szekely, Motasem Alfarra, Bernard Ghanem
PDF
Towards Fine-Grained Spatial Control for Soccer Game Image Generation Amadou S. Sangare, Adrien Maglo, Baptiste Engel, Mohamed Chaouch
PDF
Towards Holistic Visual Quality Assessment of AI-Generated Videos: A LLM-Based Multi-Dimensional Evaluation Model Zelu Qi, Ping Shi, Chaoyang Zhang, Shuqi Wang, Fei Zhao, Da Pan, Zefeng Ying
PDF
Towards Low-Latency Event-Based Obstacle Avoidance on a FPGA-Drone Pietro Bonazzi, Christian Vogt, Michael Jost, Lyes Khacef, Federico Paredes-Vallés, Michele Magno
PDF
Towards Robust Multimodal AU Detection: STN-Enhanced Visual Encoding and Audio-Visual Spatial-Temporal Alignment Jun Yu, Yunxiang Zhang, Fengzhao Sun, Leilei Wang, Renjie Lu, Lingsi Zhu, Xilong Lu, Yang Zheng, Yongqi Wang
PDF
Towards Scale-Aware Low-Light Enhancement via Structure-Guided Transformer Design Wei Dong, Yan Min, Han Zhou, Jun Chen
PDF
Towards Synthetic Concept Activation Vectors via Generative Models Riccardo Campi, Santiago Borrego, Antonio De Santis, Matteo Bianchi, Andrea Tocchetti, Marco Brambilla
PDF
Towards Trustworthy Autonomous Vehicles with Vision-Language Models Under Targeted and Untargeted Adversarial Attacks Awal Ahmed Fime, Md. Zarif Hossain, Saika Zaman, Abdur Rahman Bin Shahid, Ahmed Imteaj
PDF
Towards Unconstrained 2D Pose Estimation of the Human Spine Muhammad Saif Ullah Khan, Stephan Krauß, Didier Stricker
PDF
Traffic Sign Recognition Under Visual Perturbations: Shadows, Light Patches, and Simulated Obstructions Muneeb Ahmed Khan, Yujin Choi, Jiho Eum, Heemin Park
PDF
Training Data Reconstruction: Privacy Due to Uncertainty? Christina Runkel, Kanchana Vaishnavi Gandikota, Jonas Geiping, Carola-Bibiane Schönlieb, Michael Moeller
PDF
Training Neural Networks on RAW and HDR Images for Restoration Tasks Andrew Yanzhe Ke, Lei Luo, Xiaoyu Xiang, Yuchen Fan, Rakesh Ranjan, Alexandre Chapiro, Rafal Mantiuk
PDF
Training-Free Color-Style Disentanglement for Constrained Text-to-Image Synthesis Aishwarya Agarwal, Srikrishna Karanam, Balaji Vasan Srinivasan
PDF
TrajGNAS: Heterogeneous Multiagent Trajectory Prediction Based on a Graph Neural Architecture Search Yunheng Xu, Jie Chen, Shuoheng Wang, Xinwen Wang
PDF
Transformer-Based Lung Infection Severity Prediction with Cross Attention and Conditional TransMix Augmentation Bouthaina Slika, Fadi Dornaika, Fares Bougourzi, Karim Hammoudi
PDF
TRISHUL: Towards Region Identification and Screen Hierarchy Understanding for Large VLM Based GUI Agents Kunal Singh, Shreyas Singh, Mukund Khanna
PDF
Trustworthy Multi-UAV Collaboration: A Self-Supervised Framework for Explainable and Adversarially Robust Decision-Making Yuwei Chen, Shiyong Chu
PDF
TT3D: Table Tennis 3D Reconstruction Thomas Gossard, Andreas Ziegler, Andreas Zell
PDF
TTGen: Incorporating Test-Time Scaling to Diffusion Models Yuming Qiao, Yuechen Wang, Xudong Zhang, Dan Meng
PDF
Turin3D: Evaluating Adaptation Strategies Under Label Scarcity in Urban LiDAR Segmentation with Semi-Supervised Techniques Luca Barco, Giacomo Blanco, Gaetano Chiriaco, Alessia Intini, Luigi La Riccia, Vittorio Scolamiero, Piero Boccardo, Paolo Garza, Fabrizio Dominici
PDF
Two Views Are Better than One: Monocular 3D Pose Estimation with Multiview Consistency Christian Keilstrup Ingwersen, Rasmus Tirsgaard, Rasmus Nylander, Janus Nørtoft Jensen, Anders Bjorholm Dahl, Morten Rieger Hannemose
PDF
U-Shape Mamba: State Space Model for Faster Diffusion Alex Ergasti, Filippo Botti, Tomaso Fontanini, Claudio Ferrari, Massimo Bertozzi, Andrea Prati
PDF
Uncertainty Aware Training to Improve Uncertainty Active Learning for Semantic Segmentation Moritz Thoma, Tobias Preintner, Emad Aghajanzadeh, Shambhavi Balamuthu Sampath, Pierpaolo Morì, Nael Fasfous, Manoj Rohit Vemparala, Alexander Frickenstein, Daniel Mueller-Gritschneder, Ulf Schlichtmann
PDF
Uncertainty Quantification for Gradient-Based Explanations in Neural Networks Mihir Mulye, Matias Valdenegro-Toro
PDF
Uncertainty-Guided Style-Aware Perceptual Quality Assessment for AI-Generated Images Tushar Shinde, Shivaanee Eswaran
PDF
Uncovering Branch-Specialization in InceptionV1 Using K Sparse Autoencoders Matthew Bozoukov
PDF
Understanding and Mitigating Toxicity in Image-Text Pretraining Datasets: A Case Study on LLaVA Nahid Alam, Karthik Reddy Kanjula, Surya Guthikonda, Shayakh Islam
PDF
Understanding Depth and Height Perception in Large Visual-Language Models Shehreen Azad, Yash Jain, Rishit Garg, Vibhav Vineet, Yogesh S. Rawat
PDF
Understanding the Effect of Using Semantically Meaningful Tokens for Visual Representation Learning Neha Mukund Kalibhat, Priyatham Kattakinda, Sumit Nawathe, Arman Zarei, Nikita Seleznev, Samuel Sharpe, Senthil Kumar, Soheil Feizi
PDF
United We Stand, Divided We Fall: Handling Weak Complementarity for Audio-Visual Emotion Recognition in Valence-Arousal Space Gnana Praveen Rajasekhar, Jahangir Alam, Eric Charton
PDF
UniToken: Harmonizing Multimodal Understanding and Generation Through Unified Visual Encoding Yang Jiao, Haibo Qiu, Zequn Jie, Shaoxiang Chen, Jingjing Chen, Lin Ma, Yu-Gang Jiang
PDF
Universal Shape of Strong Remote Adversarial Patches for Object Detection with Convolutional Neural Networks Kento Oonishi, Tsunato Nakai
PDF
UPPET: Unified Pedestrian Pose Estimation in Thermal Imaging Mickael Cormier, Andreas Specker, Jürgen Beyerer
PDF
V-NAW: Video-Based Noise-Aware Adaptive Weighting for Facial Expression Recognition JunGyu Lee, Kunyoung Lee, Haesol Park, Ig-Jae Kim, Gi Pyo Nam
PDF
V3LMA: Visual 3D-Enhanced Language Model for Autonomous Driving Jannik Lübberstedt, Esteban Rivera, Nico Uhlemann, Markus Lienkamp
PDF
Video, How Do Your Tokens Merge? Sam Pollard, Michael Wray
PDF
ViDROP: Video Dense Representation Through Spatio-Temporal Sparsity Sepehr Sameni, Simon Jenni, Paolo Favaro
PDF
Virtual Pose Coach: A Motion-Retargeting Approach for Pose Training Tzu-Chun Chiu, Ming-Han Lee, Kun-Ru Wu, Yu-Shuen Wang, Yu-Chee Tseng
PDF
Vision Language Models for Massive MIMO Semantic Communication Stephen D. Liang
PDF
VisionCube: 3D-Aware Vision-Language Model for Multi-Step Spatial Reasoning Feiyang Wang, Nan Luo, Wangyu Wu
PDF
VISTA-CLIP: Visual Incremental Self-Tuned Adaptation for Efficient Continual Panoptic Segmentation Manjunath D, Shrikar Madhu, Aniruddh Sikdar, Suresh Sundaram
PDF
Visual Question Answering on Multiple Remote Sensing Image Modalities Hichem Boussaid, Lucrezia Tosato, Flora Weissgerber, Camille Kurtz, Laurent Wendling, Sylvain Lobry
PDF
Visualizing and Controlling Cortical Responses Using Voxel-Weighted Activation Maximization Matthew W. Shinkle, Mark D. Lescroart
PDF
Visually Interpretable Subtask Reasoning for Visual Question Answering Yu Cheng, Arushi Goel, Hakan Bilen
PDF
ViT4V: A Video Classification Method for the Detection of Varroa Destructor from Honeybees Luca Giovannesi, Paolo Russo, Roberto Beraldi
PDF
VNL-STES: A Benchmark Dataset and Model for Spatiotemporal Event Spotting in Volleyball Analytics Hoang Quoc Nguyen, Ankhzaya Jamsrandorj, Vanyi Chao, Yin May Oo, Muhammad Amrulloh Robbani, Kyung-Ryoul Mun, Jinwook Kim
PDF
Vocabulary-Free Few-Shot Learning for Vision-Language Models Maxime Zanella, Clément Fuchs, Ismail Ben Ayed, Christophe De Vleeschouwer
PDF
VolTex: Food Volume Estimation Using Text-Guided Segmentation and Neural Surface Reconstruction Ahmad AlMughrabi, Umair Haroon, Ricardo Marques, Petia Radeva
PDF
VRAG: Retrieval-Augmented Video Question Answering for Long-Form Videos Bao Tran Gia, Khiem Le, Tien Do, Tien-Dung Mai, Thanh Duc Ngo, Duy-Dinh Le, Shin'ichi Satoh
PDF
VRU-CIPI: Crossing Intention Prediction at Intersections for Improving Vulnerable Road Users Safety Ahmed S. Abdelrahman, Mohamed A. Abdel-Aty, Quoc Dai Tran
PDF
WaveDIF: Wavelet Sub-Band Based Deepfake Identification in Frequency Domain Anurag Dutta, Arnab Kumar Das, Ruchira Naskar, Rajat Subhra Chakraborty
PDF
Wavelet-Based Mechanistic Interpretability of Vision Transformers via Frequency-Aware Ablations Sophia J. Abraham, Jonathan D. Hauenstein, Walter J. Scheirer
PDF
Weakly Supervised Panoptic Segmentation for Defect-Based Grading of Fresh Produce Manuel Knott, Divinefavour Odion, Sameer Sontakke, Anup Karwa, Thijs Defraeye
PDF
What Is the Added Value of UDA in the VFM Era? Brunó Bence Englert, Tommie Kerssies, Gijs Dubbelman
PDF
What Makes for a Good Stereoscopic Image? Netanel Tamir, Shir Amir, Ranel Itzhaky, Noam Atia, Shobhita Sundaram, Stephanie Fu, Ron Sokolovsky, Phillip Isola, Tali Dekel, Richard Zhang, Miriam Farber
PDF
Wheat3DGS: In-Field 3D Reconstruction, Instance Segmentation and Phenotyping of Wheat Heads with Gaussian Splatting Daiwei Zhang, Joaquin Gajardo, Tomislav Medic, Isinsu Katircioglu, Mike Boss, Norbert Kirchgeßner, Achim Walter, Lukas Roth
PDF
When Textures Deceive: Weakly Supervised Industrial Anomaly Detection with Adapted-Loss CycleGAN Tapan Ganatma Nakkina, Yuhao Zhong, Pete Sumethasorn, Haopeng Tian, Satish T. S. Bukkapatnam
PDF
Where Is the Ball: 3D Ball Trajectory Estimation from 2D Monocular Tracking Puntawat Ponglertnapakorn, Supasorn Suwajanakorn
PDF
Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models Yuxiang Lin, Jingdong Sun, Zhi-Qi Cheng, Jue Wang, Haomin Liang, Zebang Cheng, Yifei Dong, Jun-Yan He, Xiaojiang Peng, Xian-Sheng Hua
PDF
WildlifeReID-10k: Wildlife Re-Identification Dataset with 10k Individual Animals Lukás Adam, Vojtech Cermák, Kostas Papafitsoros, Lukás Picek
PDF
Window Token Concatenation for Efficient Visual Large Language Models Yifan Li, Wentao Bao, Botao Ye, Zhen Tan, Tianlong Chen, Huan Liu, Yu Kong
PDF
WQLCP: Weighted Adaptive Conformal Prediction for Robust Uncertainty Quantification Under Distribution Shifts Shadi Alijani, Homayoun Najjaran
PDF
X-Edit: Detecting and Localizing Edits in Images Altered by Text-Guided Diffusion Models Valentina Bazyleva, Nicolò Bonettini, Gaurav Bharaj
PDF
XiEff Representation for Interpretable Near-Field Imaging Vasyl Vasylenko, Ihor Tymchyshyn, Vitalii Tymchyshyn
PDF
XYScanNet: A State Space Model for Single Image Deblurring Hanzhou Liu, Chengkai Liu, Jiacong Xu, Peng Jiang, Mi Lu
PDF
Z-SASLM: Zero-Shot Style-Aligned SLI Blending Latent Manipulation Alessio Borgi, Luca Maiano, Irene Amerini
PDF
Zero-Shot Denoising for Fluorescence Lifetime Imaging Microscopy with Intensity-Guided Learning Hao Chen, Julian Najera, Dagmawit Geresu, Meenal Datta, Cody J. Smith, Scott S. Howard
PDF
ZFusion: An Effective Fuser of Camera and 4D Radar for 3D Object Perception in Autonomous Driving Sheng Yang, Tong Zhan, Shichen Qiao, Jicheng Gong, Qing Yang, Jian Wang, Yanfeng Lu
PDF