Boosting Pose Estimators via Cross-Representation Distillation
Abstract
Pose estimation is a computationally intensive task that involves locating the keypoints of the human body in images. Pose estimators are typically categorized into heatmap-based and regression-based models. Heatmap-based models offer strong performance but are computationally heavy, while regression-based models are lightweight but less accurate. In this paper, we propose Online Cross-Representation Distillation (OCD), which leverages the strengths of heatmap-based methods to enhance the performance of regression-based models while preserving their lightweight characteristics. We use the task loss itself as the distillation loss and introduce a cross-representation head distillation loss, achieving significant performance improvements with a single quick online distillation training session. For example, for a ResNet50-based regression model (DeepPose), we improve performance from 52.8 to 62.8 mAP on the COCO Body dataset. Building on OCD, we design Teacher-Aided Cross-Representation Distillation (TCD), which uses a pretrained heatmap-based teacher to distill a regression-based student. Additionally, we modify OCD to reduce the size of the model's head: through distillation, the network achieves the same performance even with a downsized head structure. We evaluate our proposed methods on the COCO Body and COCO WholeBody datasets, where they demonstrate significant improvements and good generalizability. Our code is available at https://github.com/luckin99/OCD .
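One possible reading of "task loss as the distillation loss" can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the heatmap branch's predictions are argmax-decoded into normalized coordinates and that the regression task loss is a plain L1 loss. The names `decode_heatmaps`, `l1_task_loss`, `distillation_loss`, and the weight `alpha` are hypothetical and do not come from the paper or its repository.

```python
import numpy as np

def decode_heatmaps(heatmaps):
    """Argmax-decode (K, H, W) heatmaps into (K, 2) normalized (x, y) coordinates.

    Assumes H > 1 and W > 1 so the normalization is well defined.
    """
    k, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(k, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (h, w))
    return np.stack([xs / (w - 1), ys / (h - 1)], axis=1)

def l1_task_loss(pred, target):
    """Mean absolute error between two (K, 2) coordinate arrays (assumed task loss)."""
    return float(np.abs(pred - target).mean())

def distillation_loss(student_coords, teacher_heatmaps, gt_coords, alpha=1.0):
    """Task loss on ground truth plus the same task loss on teacher-decoded targets.

    alpha is a hypothetical weighting factor for the distillation term.
    """
    teacher_coords = decode_heatmaps(teacher_heatmaps)
    supervised = l1_task_loss(student_coords, gt_coords)
    distilled = l1_task_loss(student_coords, teacher_coords)
    return supervised + alpha * distilled

# Tiny example: two keypoints on 5x5 heatmaps.
heatmaps = np.zeros((2, 5, 5))
heatmaps[0, 2, 4] = 1.0  # peak at row 2, column 4
heatmaps[1, 0, 0] = 1.0  # peak at row 0, column 0
teacher_xy = decode_heatmaps(heatmaps)
student_xy = np.array([[0.5, 0.5], [0.0, 0.0]])
loss = distillation_loss(student_xy, heatmaps, gt_coords=teacher_xy)
```

The point of the sketch is that the regression student is supervised by the heatmap branch through the same loss it already uses for ground truth, so no separate feature-matching objective is needed for this term. The cross-representation head distillation loss mentioned in the abstract is not modeled here.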
Cite
Text
Liu et al. "Boosting Pose Estimators via Cross-Representation Distillation." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-91575-8_21
Markdown
[Liu et al. "Boosting Pose Estimators via Cross-Representation Distillation." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/liu2024eccvw-boosting/) doi:10.1007/978-3-031-91575-8_21
BibTeX
@inproceedings{liu2024eccvw-boosting,
title = {{Boosting Pose Estimators via Cross-Representation Distillation}},
author = {Liu, Kang and Yang, Zhendong and Zhang, Jingyun and Wang, Jun and Wang, Shaoming and Yuan, Chun and Guo, Rizen},
booktitle = {European Conference on Computer Vision Workshops},
year = {2024},
pages = {343--358},
doi = {10.1007/978-3-031-91575-8_21},
url = {https://mlanthology.org/eccvw/2024/liu2024eccvw-boosting/}
}