CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators
Abstract
We introduce CHAMP, a novel method for learning sequence-to-sequence, multi-hypothesis 3D human poses from 2D keypoints by leveraging a conditional distribution with a diffusion model. To predict a single output 3D pose sequence, we generate and aggregate multiple 3D pose hypotheses. For better aggregation results, we develop a method to score these hypotheses during training, effectively integrating conformal prediction into the learning process. This process results in a differentiable conformal predictor that is trained end-to-end with the 3D pose estimator. After training, the learned scoring model is used as the conformity score, and the 3D pose estimator is combined with a conformal predictor to select the most accurate hypotheses for downstream aggregation. Our results indicate that simple mean aggregation over the hypothesis set filtered by conformal prediction yields competitive results. When integrated with more sophisticated aggregation techniques, our method achieves state-of-the-art performance across various metrics and datasets while inheriting the probabilistic guarantees of conformal prediction.
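The filter-then-aggregate step described in the abstract can be illustrated with a minimal sketch of generic split conformal prediction. This is not the authors' implementation: the function names (`conformal_quantile`, `filter_hypotheses`, `mean_pose`), the convention that a lower score means a better hypothesis, the fallback for an empty filtered set, and the toy scores are all assumptions made for illustration. In CHAMP the scores would come from the learned scoring model, which is not reproduced here.

```python
import math
from typing import List, Sequence

Pose = List[List[float]]  # one pose: J joints, each with 3 coordinates


def conformal_quantile(cal_scores: Sequence[float], alpha: float) -> float:
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest
    calibration score (assumed convention: lower score = better)."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: nothing is filtered.
        return math.inf
    return sorted(cal_scores)[k - 1]


def filter_hypotheses(hyps: Sequence[Pose], scores: Sequence[float],
                      threshold: float) -> List[Pose]:
    """Keep hypotheses whose score does not exceed the threshold."""
    kept = [h for h, s in zip(hyps, scores) if s <= threshold]
    if not kept:
        # Assumed fallback: keep the single lowest-scoring hypothesis.
        best = min(range(len(hyps)), key=lambda i: scores[i])
        kept = [hyps[best]]
    return kept


def mean_pose(hyps: Sequence[Pose]) -> Pose:
    """Simple mean aggregation, joint by joint and coordinate by coordinate."""
    n = len(hyps)
    return [[sum(h[j][c] for h in hyps) / n for c in range(3)]
            for j in range(len(hyps[0]))]


# Toy data (hypothetical): ten calibration scores, three one-joint hypotheses.
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
threshold = conformal_quantile(cal_scores, alpha=0.2)

hyps = [[[0.0, 0.0, 0.0]], [[2.0, 2.0, 2.0]], [[10.0, 10.0, 10.0]]]
scores = [0.3, 0.5, 2.0]  # stand-ins for the learned scoring model's outputs

estimate = mean_pose(filter_hypotheses(hyps, scores, threshold))
print(threshold, estimate)
```

In this toy run the third hypothesis exceeds the calibrated threshold and is dropped before averaging, so the outlier does not pull the mean away from the two hypotheses that agree.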
Cite
Text
Zhang and Carlone. "CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators." International Conference on Learning Representations, 2025.
Markdown
[Zhang and Carlone. "CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/zhang2025iclr-champ/)
BibTeX
@inproceedings{zhang2025iclr-champ,
title = {{CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators}},
author = {Zhang, Harry and Carlone, Luca},
booktitle = {International Conference on Learning Representations},
year = {2025},
url = {https://mlanthology.org/iclr/2025/zhang2025iclr-champ/}
}