Optimal Transformation Estimation with Semantic Cues

Abstract

This paper addresses the problem of estimating the geometric transformation relating two distinct visual modalities (e.g., an image and a map, or a projective structure and a Euclidean 3D model) while relying only on semantic cues, such as semantically segmented regions or object bounding boxes. The proposed approach differs from traditional feature-to-feature correspondence reasoning: starting from semantic regions on one side, we seek their possible corresponding regions on the other, thus constraining the sought geometric transformation. This entails a simultaneous search for the transformation and for the region-to-region correspondences. This paper is the first to derive the conditions that must be satisfied for a convex region, defined by control points, to be transformed inside an ellipsoid. These conditions are formulated as Linear Matrix Inequalities and used within a Branch-and-Prune search to obtain the globally optimal transformation. We tested our approach, under mild initial bound conditions, on two challenging registration problems: (i) aligning a semantically segmented image and a map via a 2D homography; and (ii) aligning a projective 3D structure and its Euclidean counterpart.
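The containment idea in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes a 2D affine map x -> A x + t rather than a homography or a 3D upgrade, it tests one fixed candidate transformation instead of solving Linear Matrix Inequalities over a bounded set of transformations, and all function and variable names are hypothetical. The sketch relies on a standard fact: an affine map sends the convex hull of the control points to the convex hull of their images, and an ellipse is convex, so the whole region lands inside the ellipse exactly when every control point does. The ellipse is written as (y - c)^T P (y - c) <= 1 with P symmetric positive definite.

```python
def apply_affine(A, t, p):
    """Map the 2D point p through x -> A x + t (A is 2x2, t has length 2)."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + t[0],
            A[1][0] * p[0] + A[1][1] * p[1] + t[1])


def inside_ellipse(y, c, P):
    """Return True if (y - c)^T P (y - c) <= 1 for a symmetric 2x2 matrix P."""
    dx, dy = y[0] - c[0], y[1] - c[1]
    return P[0][0] * dx * dx + 2.0 * P[0][1] * dx * dy + P[1][1] * dy * dy <= 1.0


def region_maps_inside(A, t, control_points, c, P):
    """Check every control point; by convexity this covers the whole region."""
    return all(inside_ellipse(apply_affine(A, t, p), c, P)
               for p in control_points)


# Illustrative data (assumed, not from the paper): a unit square as the
# convex region, and an origin-centered ellipse with semi-axes 2 and 1.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
c = (0.0, 0.0)
P = [[0.25, 0.0], [0.0, 1.0]]
A = [[1.0, 0.0], [0.0, 1.0]]   # identity linear part

t_in = (-0.5, -0.5)   # centers the square on the ellipse
t_out = (1.5, -0.5)   # pushes the square past the ellipse boundary

fits_centered = region_maps_inside(A, t_in, square, c, P)
fits_shifted = region_maps_inside(A, t_out, square, c, P)
```

Writing P as the inverse of a positive definite matrix Q, the Schur complement turns the same test into the positive semidefiniteness of the block matrix [[Q, y - c], [(y - c)^T, 1]], which is linear in the transformed point y. This is the general mechanism by which ellipsoid containment can be expressed as a Linear Matrix Inequality in the transformation parameters.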

Cite

Text

Paudel et al. "Optimal Transformation Estimation with Semantic Cues." International Conference on Computer Vision, 2017. doi:10.1109/ICCV.2017.499

Markdown

[Paudel et al. "Optimal Transformation Estimation with Semantic Cues." International Conference on Computer Vision, 2017.](https://mlanthology.org/iccv/2017/paudel2017iccv-optimal/) doi:10.1109/ICCV.2017.499

BibTeX

@inproceedings{paudel2017iccv-optimal,
  title     = {{Optimal Transformation Estimation with Semantic Cues}},
  author    = {Paudel, Danda Pani and Habed, Adlane and Van Gool, Luc},
  booktitle = {International Conference on Computer Vision},
  year      = {2017},
  doi       = {10.1109/ICCV.2017.499},
  url       = {https://mlanthology.org/iccv/2017/paudel2017iccv-optimal/}
}