Harnessing Text-to-Image Diffusion Models for Category-Agnostic Pose Estimation
Abstract
Category-Agnostic Pose Estimation (CAPE) aims to detect keypoints of an arbitrary unseen category in images, based on several provided examples of that category. This is a challenging task, as the limited data of unseen categories makes it difficult for models to generalize effectively. To address this challenge, previous methods typically train models on a set of predefined base categories with extensive annotations. In this work, we propose to harness rich knowledge in the off-the-shelf text-to-image diffusion model to effectively address CAPE, without training on carefully prepared base categories. To this end, we propose a Prompt Pose Matching (PPM) framework, which learns pseudo prompts corresponding to the keypoints in the provided few-shot examples via the text-to-image diffusion model. These learned pseudo prompts capture semantic information of keypoints, which can then be used to locate the same type of keypoints from images. We also design a Category-shared Prompt Training (CPT) scheme, to further boost our PPM’s performance. Extensive experiments demonstrate the efficacy of our approach.
Cite
Text
Peng et al. "Harnessing Text-to-Image Diffusion Models for Category-Agnostic Pose Estimation." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72624-8_20
Markdown
[Peng et al. "Harnessing Text-to-Image Diffusion Models for Category-Agnostic Pose Estimation." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/peng2024eccv-harnessing/) doi:10.1007/978-3-031-72624-8_20
BibTeX
@inproceedings{peng2024eccv-harnessing,
title = {{Harnessing Text-to-Image Diffusion Models for Category-Agnostic Pose Estimation}},
author = {Peng, Duo and Zhang, Zhengbo and Hu, Ping and Ke, Qiuhong and Yau, David and Liu, Jun},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-72624-8_20},
url = {https://mlanthology.org/eccv/2024/peng2024eccv-harnessing/}
}