VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space
Abstract
Previous works on Human Pose and Shape Estimation (HPSE) from RGB images can be broadly categorized into two main groups: parametric and non-parametric approaches. Parametric techniques leverage a low-dimensional statistical body model for realistic results, whereas recent non-parametric methods achieve higher precision by directly regressing the 3D coordinates of the human body mesh. This work introduces a novel paradigm to address the HPSE problem, involving a low-dimensional discrete latent representation of the human mesh and framing HPSE as a classification task. Instead of predicting body model parameters or 3D vertex coordinates, we focus on predicting the proposed discrete latent representation, which can be decoded into a registered human mesh. This paradigm offers two key advantages. First, predicting a low-dimensional discrete representation confines our predictions to the space of anthropomorphic poses and shapes even when little training data is available. Second, by framing the problem as a classification task, we can harness the discriminative power inherent in neural networks. The proposed model, VQ-HPS, predicts the discrete latent representation of the mesh. The experimental results demonstrate that VQ-HPS outperforms the current state-of-the-art non-parametric approaches while yielding results as realistic as those produced by parametric methods when trained with little data. VQ-HPS also shows promising results when trained on large-scale datasets, highlighting the significant potential of the classification approach for HPSE. See the project page at https://g-fiche.github.io/research-pages/vqhps/.
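The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea it describes: continuous latents are mapped to indices in a codebook (vector quantization), and a predictor is then trained as a classifier over those indices. The codebook size, the vectors, the logits, and every function name here are hypothetical and chosen for illustration; they are not taken from VQ-HPS.

```python
import math

# Hypothetical toy codebook: 4 code vectors of dimension 2.
# Sizes and values are illustrative, not taken from the paper.
CODEBOOK = [
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]


def quantize(latent):
    """Return the index of the codebook vector nearest to `latent` (squared L2)."""
    def sq_dist(code):
        return sum((a - b) ** 2 for a, b in zip(latent, code))
    return min(range(len(CODEBOOK)), key=lambda i: sq_dist(CODEBOOK[i]))


def cross_entropy(logits, target_index):
    """Classification loss for one token: -log softmax(logits)[target_index]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]


def predict_indices(logits_per_token):
    """Pick the highest-scoring codebook index for each token (argmax)."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in logits_per_token]


def lookup(indices):
    """Map indices back to code vectors; a mesh decoder would consume these."""
    return [CODEBOOK[i] for i in indices]


# Encoder side: continuous latents become discrete target indices.
targets = [quantize(z) for z in [[0.1, -0.2], [0.9, 0.8]]]

# Predictor side: per-token scores over the codebook (hand-written here,
# standing in for the output of an image-conditioned network).
logits = [[2.0, 0.1, 0.1, 0.1], [0.0, 0.2, 0.1, 3.0]]
predicted = predict_indices(logits)

# Training signal: mean cross-entropy between predicted scores and target indices.
loss = sum(cross_entropy(row, t) for row, t in zip(logits, targets)) / len(targets)
```

Because every prediction is an index into a fixed codebook, the decoded output can only be built from code vectors learned from real meshes, which is one way to read the abstract's claim that predictions stay within the space of anthropomorphic poses and shapes.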
Cite
Text
Fiche et al. "VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72943-0_27
Markdown
[Fiche et al. "VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/fiche2024eccv-vqhps/) doi:10.1007/978-3-031-72943-0_27
BibTeX
@inproceedings{fiche2024eccv-vqhps,
title = {{VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space}},
author = {Fiche, Guénolé and Leglaive, Simon and Alameda-Pineda, Xavier and Agudo, Antonio and Moreno, Francesc},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-72943-0_27},
url = {https://mlanthology.org/eccv/2024/fiche2024eccv-vqhps/}
}