BRIGHT-VO: Brightness-Guided Hybrid Transformer for Visual Odometry with Multi-Modality Refinement Module

Wang, Dongzhihan; Yang, Yang; Chen, Xuyang; Xu, Liang

doi:10.24963/IJCAI.2025/216

IJCAI 2025 pp. 1936-1944

Abstract

Visual odometry (VO) plays a crucial role in autonomous driving, robotic navigation, and other related tasks by estimating the position and orientation of a camera based on visual input. Significant progress has been made in data-driven VO methods, particularly those leveraging deep learning techniques to extract image features and estimate camera poses. However, these methods often struggle in low-light conditions because of the reduced visibility of features and the increased difficulty of matching keypoints. To address this limitation, we introduce BrightVO, a novel VO model based on the Transformer architecture, which not only performs front-end visual feature extraction but also incorporates a multi-modality refinement module in the back-end that integrates Inertial Measurement Unit (IMU) data. Using pose graph optimization, this module iteratively refines pose estimates to reduce errors and improve both accuracy and robustness. Furthermore, we create a synthetic low-light dataset, KiC4R, which includes a variety of lighting conditions to facilitate the training and evaluation of VO frameworks in challenging environments. Experimental results demonstrate that BrightVO achieves state-of-the-art performance on both the KiC4R dataset and the KITTI benchmarks. Specifically, it outperforms existing methods with an average improvement in pose estimation accuracy of 20% in normal outdoor environments and 25% in low-light conditions. This work is open-source at https://github.com/Anastasiawd/BrightVO.

Cite

Text

Wang et al. "BRIGHT-VO: Brightness-Guided Hybrid Transformer for Visual Odometry with Multi-Modality Refinement Module." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/216

Markdown

[Wang et al. "BRIGHT-VO: Brightness-Guided Hybrid Transformer for Visual Odometry with Multi-Modality Refinement Module." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/wang2025ijcai-bright/) doi:10.24963/IJCAI.2025/216

BibTeX

@inproceedings{wang2025ijcai-bright,
  title     = {{BRIGHT-VO: Brightness-Guided Hybrid Transformer for Visual Odometry with Multi-Modality Refinement Module}},
  author    = {Wang, Dongzhihan and Yang, Yang and Chen, Xuyang and Xu, Liang},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {1936--1944},
  doi       = {10.24963/IJCAI.2025/216},
  url       = {https://mlanthology.org/ijcai/2025/wang2025ijcai-bright/}
}