Fine-Tuning Human Pose Estimations in Videos
Abstract
We propose a semi-supervised self-training method for fine-tuning human pose estimations in videos that provides accurate estimations even for complex sequences. We surpass the state of the art on most of the datasets used and also show a 2.33% gain over the baseline on our new dataset of unrestricted sports videos. The self-training model presented has two components: a static Pictorial Structure (PS) based model and a dynamic ensemble of exemplars. We present a pose quality criterion that is primarily used for batch selection and automatic parameter selection. The same criterion also serves as a low-level pose evaluator in post-processing. We set a new challenge by introducing CVIT-SPORTS, a complex dataset of sports videos with full human body-part annotations. The strength of our method is demonstrated by adapting it to videos of complex activities such as cricket bowling, cricket batting, and football, as well as to available standard datasets.
Cite
Text
Singh et al. "Fine-Tuning Human Pose Estimations in Videos." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016. doi:10.1109/WACV.2016.7477680

Markdown

[Singh et al. "Fine-Tuning Human Pose Estimations in Videos." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016.](https://mlanthology.org/wacv/2016/singh2016wacv-fine/) doi:10.1109/WACV.2016.7477680

BibTeX
@inproceedings{singh2016wacv-fine,
title = {{Fine-Tuning Human Pose Estimations in Videos}},
author = {Singh, Digvijay and Balasubramanian, Vineeth and Jawahar, C. V.},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
year = {2016},
  pages = {1--9},
doi = {10.1109/WACV.2016.7477680},
url = {https://mlanthology.org/wacv/2016/singh2016wacv-fine/}
}