View from Above: Orthogonal-View Aware Cross-View Localization
Abstract
This paper presents a novel aerial-to-ground feature aggregation strategy tailored to cross-view image-based geo-localization. Conventional vision-based methods rely heavily on matching ground-view image features against a pre-recorded image database, often by establishing planar homography correspondences under a planar-ground assumption. As such, they tend to ignore off-ground features and are poorly suited to handling visual occlusions, leading to unreliable localization in challenging scenarios. We propose a Top-to-Ground Aggregation module that capitalizes on aerial orthographic views to aggregate features down to the ground level, leveraging reliable off-ground information to improve feature alignment. Furthermore, we introduce a Cycle Domain Adaptation loss that ensures feature extraction remains robust across domain changes. Additionally, an Equidistant Re-projection loss is introduced to equalize the impact of all keypoints on orientation error, leading to a more extended distribution of keypoints, which benefits orientation estimation. On both the KITTI and Ford Multi-AV datasets, our method consistently achieves the lowest mean longitudinal and lateral translation errors across different settings, and obtains the smallest orientation error in the more challenging setting where the initial pose is less accurate. It can also complete an entire route through continual vehicle pose estimation, with the initial vehicle pose given only at the starting point.
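To illustrate why a planar-ground assumption mishandles off-ground features, the following is a minimal sketch of standard pinhole projection and ground-plane back-projection. It is not the paper's method. The function names, the camera convention (x right, y down, z forward, ground plane at y = camera height), and all numeric values are assumptions chosen for illustration.

```python
def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point (X, Y, Z) to a pixel (u, v)."""
    X, Y, Z = point
    return (fx * X / Z + cx, fy * Y / Z + cy)


def backproject_to_ground(pixel, cam_height, fx, fy, cx, cy):
    """Intersect the viewing ray of a pixel with the ground plane y = cam_height.

    Returns (lateral X, forward Z) in the camera frame. This is the step that
    silently assumes every pixel depicts a point lying on the ground.
    """
    u, v = pixel
    dx = (u - cx) / fx
    dy = (v - cy) / fy
    if dy <= 0:
        raise ValueError("pixel is at or above the horizon; ray never hits the ground")
    t = cam_height / dy  # ray parameter where the ray reaches the ground plane
    return (t * dx, t)


# Hypothetical intrinsics and camera height, for illustration only.
FX, FY, CX, CY = 720.0, 720.0, 620.0, 187.0
CAM_HEIGHT = 1.65  # metres above the ground (assumed)

# A point on the ground, 10 m ahead and 2 m to the right.
on_ground = (2.0, CAM_HEIGHT, 10.0)
# A point at the same location but 1 m above the ground (e.g. on a facade).
off_ground = (2.0, CAM_HEIGHT - 1.0, 10.0)

recovered_on = backproject_to_ground(project(on_ground, FX, FY, CX, CY),
                                     CAM_HEIGHT, FX, FY, CX, CY)
recovered_off = backproject_to_ground(project(off_ground, FX, FY, CX, CY),
                                      CAM_HEIGHT, FX, FY, CX, CY)

print("on-ground point recovered at (X, Z):", recovered_on)
print("off-ground point recovered at (X, Z):", recovered_off)
```

Under these assumed values, the on-ground point is recovered at its true position, while the point 1 m above the ground is placed roughly 25.4 m ahead instead of 10 m. Errors of this kind are one reason planar-homography matching tends to discard off-ground features.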
Cite
Text
Wang et al. "View from Above: Orthogonal-View Aware Cross-View Localization." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.01406
Markdown
[Wang et al. "View from Above: Orthogonal-View Aware Cross-View Localization." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/wang2024cvpr-view/) doi:10.1109/CVPR52733.2024.01406
BibTeX
@inproceedings{wang2024cvpr-view,
title = {{View from Above: Orthogonal-View Aware Cross-View Localization}},
author = {Wang, Shan and Nguyen, Chuong and Liu, Jiawei and Zhang, Yanhao and Muthu, Sundaram and Maken, Fahira Afzal and Zhang, Kaihao and Li, Hongdong},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2024},
pages = {14843--14852},
doi = {10.1109/CVPR52733.2024.01406},
url = {https://mlanthology.org/cvpr/2024/wang2024cvpr-view/}
}