EAGLE: Efficient Adaptive Geometry-Based Learning in Cross-View Understanding
Abstract
Unsupervised Domain Adaptation has been an efficient approach to transferring the semantic segmentation model across data distributions. Meanwhile, the recent Open-vocabulary Semantic Scene understanding based on large-scale vision language models is effective in open-set settings because it can learn diverse concepts and categories. However, these prior methods fail to generalize across different camera views due to the lack of cross-view geometric modeling. At present, there are limited studies analyzing cross-view learning. To address this problem, we introduce a novel Unsupervised Cross-view Adaptation Learning approach to modeling the geometric structural change across views in Semantic Scene Understanding. First, we introduce a novel Cross-view Geometric Constraint on Unpaired Data to model structural changes in images and segmentation masks across cameras. Second, we present a new Geodesic Flow-based Correlation Metric to efficiently measure the geometric structural changes across camera views. Third, we introduce a novel view-condition prompting mechanism to enhance the view-information modeling of the open-vocabulary segmentation network in cross-view adaptation learning. The experiments on different cross-view adaptation benchmarks have shown the effectiveness of our approach in cross-view modeling, demonstrating that we achieve State-of-the-Art (SOTA) performance compared to prior unsupervised domain adaptation and open-vocabulary semantic segmentation methods.
Cite
Text
Truong et al. "EAGLE: Efficient Adaptive Geometry-Based Learning in Cross-View Understanding." Neural Information Processing Systems, 2024. doi:10.52202/079017-4362
Markdown
[Truong et al. "EAGLE: Efficient Adaptive Geometry-Based Learning in Cross-View Understanding." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/truong2024neurips-eagle/) doi:10.52202/079017-4362
BibTeX
@inproceedings{truong2024neurips-eagle,
title = {{EAGLE: Efficient Adaptive Geometry-Based Learning in Cross-View Understanding}},
author = {Truong, Thanh-Dat and Prabhu, Utsav and Wang, Dongyi and Raj, Bhiksha and Gauch, Susan and Subbiah, Jeyamkondan and Luu, Khoa},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-4362},
url = {https://mlanthology.org/neurips/2024/truong2024neurips-eagle/}
}