THP3D: Text-Driven Multi-Granularity 3D Human Parsing

Suzuki, Keito; Du, Bang; Chen, Kunyao; Li, Runfa Blark; Nguyen, Truong Q.

doi:10.1007/978-3-031-91575-8_4

ECCVW 2024 pp. 53-70

Abstract

Current methods for segmenting 3D human data usually train on a single dataset. Because 3D human datasets differ greatly in class labels, content diversity, and overall quality, a model trained on one dataset may not generalize well to others. A conventional remedy is to manually unify the datasets’ classes into a single taxonomy for training. However, such models remain confined to the classes present in their initial training sets, which limits their scalability as new data is introduced. Additionally, a universal set of labels that satisfies all possible downstream applications remains elusive. To tackle these challenges, we present THP3D, a general 3D human parsing method that enables multi-granularity training and inference. Our model can accumulate knowledge from all datasets without manual label unification and supports arbitrary segmentation classes through user text inputs. To achieve this, we construct a new dataset, to be released publicly, that augments the THuman2.0 dataset with highly detailed labels. It offers 12 labels that encapsulate both garment and chiral body part information, finer than those of existing datasets. Experiments on various datasets show that our model achieves strong performance and can segment at various granularities.

Cite

Text

Suzuki et al. "THP3D: Text-Driven Multi-Granularity 3D Human Parsing." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-91575-8_4

Markdown

[Suzuki et al. "THP3D: Text-Driven Multi-Granularity 3D Human Parsing." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/suzuki2024eccvw-thp3d/) doi:10.1007/978-3-031-91575-8_4

BibTeX

@inproceedings{suzuki2024eccvw-thp3d,
  title     = {{THP3D: Text-Driven Multi-Granularity 3D Human Parsing}},
  author    = {Suzuki, Keito and Du, Bang and Chen, Kunyao and Li, Runfa Blark and Nguyen, Truong Q.},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2024},
  pages     = {53-70},
  doi       = {10.1007/978-3-031-91575-8_4},
  url       = {https://mlanthology.org/eccvw/2024/suzuki2024eccvw-thp3d/}
}