Making DensePose Fast and Light

Abstract

The DensePose estimation task is a significant step forward for enhancing user experience in computer vision applications ranging from augmented reality to cloth fitting. Existing neural network models capable of solving this task are heavily parameterized and a long way from being transferred to an embedded or mobile device. To enable DensePose inference on the end device with current models, one needs to support an expensive server-side infrastructure and have a stable internet connection. To make things worse, mobile and embedded devices do not always have a powerful GPU inside. In this work, we target the problem of redesigning the DensePose R-CNN model's architecture so that the final network retains most of its accuracy but becomes more lightweight and fast. To achieve that, we tested and incorporated many deep learning innovations from recent years, specifically performing an ablation study on 23 efficient backbone architectures, multiple two-stage detection pipeline modifications, and custom model quantization methods. As a result, we achieved a 17-fold reduction in model size and a 2-fold latency improvement compared to the baseline model.

Cite

Text

Rakhimov et al. "Making DensePose Fast and Light." Winter Conference on Applications of Computer Vision, 2021.

Markdown

[Rakhimov et al. "Making DensePose Fast and Light." Winter Conference on Applications of Computer Vision, 2021.](https://mlanthology.org/wacv/2021/rakhimov2021wacv-making/)

BibTeX

@inproceedings{rakhimov2021wacv-making,
  title     = {{Making DensePose Fast and Light}},
  author    = {Rakhimov, Ruslan and Bogomolov, Emil and Notchenko, Alexandr and Mao, Fung and Artemov, Alexey and Zorin, Denis and Burnaev, Evgeny},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2021},
  pages     = {1869-1877},
  url       = {https://mlanthology.org/wacv/2021/rakhimov2021wacv-making/}
}