Unsupervised Learning of Object Landmarks by Factorized Spatial Embeddings

Abstract

Automatically learning the structure of object categories remains an important open problem in computer vision. We propose a novel unsupervised approach that can discover and learn to detect landmarks in object categories, thus characterizing their structure. Our approach is based on factorizing image deformations, as induced by a viewpoint change or an object articulation, by learning a deep neural network that detects landmarks compatible with such visual effects. We show that, by requiring the same neural network to be applicable to different object instances, our method naturally induces meaningful correspondences between different object instances in a category. We assess the method qualitatively on a variety of object types, natural and man-made. We also show that our unsupervised landmarks are highly predictive of manually-annotated landmarks in face benchmark datasets, and can be used to regress those with a high degree of accuracy.

Cite

Text

Thewlis et al. "Unsupervised Learning of Object Landmarks by Factorized Spatial Embeddings." International Conference on Computer Vision, 2017. doi:10.1109/ICCV.2017.348

Markdown

[Thewlis et al. "Unsupervised Learning of Object Landmarks by Factorized Spatial Embeddings." International Conference on Computer Vision, 2017.](https://mlanthology.org/iccv/2017/thewlis2017iccv-unsupervised/) doi:10.1109/ICCV.2017.348

BibTeX

@inproceedings{thewlis2017iccv-unsupervised,
  title     = {{Unsupervised Learning of Object Landmarks by Factorized Spatial Embeddings}},
  author    = {Thewlis, James and Bilen, Hakan and Vedaldi, Andrea},
  booktitle = {International Conference on Computer Vision},
  year      = {2017},
  doi       = {10.1109/ICCV.2017.348},
  url       = {https://mlanthology.org/iccv/2017/thewlis2017iccv-unsupervised/}
}