Uncertainty-Guided Probabilistic Transformer for Complex Action Recognition

Abstract

A complex action consists of a sequence of atomic actions that interact with each other over a relatively long period of time. This paper introduces a probabilistic model, the Uncertainty-Guided Probabilistic Transformer (UGPT), for complex action recognition. The self-attention mechanism of a Transformer is used to capture the complex, long-term dynamics of complex actions. By explicitly modeling the distribution of the attention scores, we extend the deterministic Transformer to a probabilistic Transformer in order to quantify the uncertainty of the prediction. This prediction uncertainty is used to improve both training and inference. Specifically, we propose a novel training strategy that introduces a majority model and a minority model based on the epistemic uncertainty. During inference, the prediction is made jointly by both models through a dynamic fusion strategy. Our method is validated on benchmark datasets, including Breakfast Actions, MultiTHUMOS, and Charades. The experimental results show that our model achieves state-of-the-art performance under both sufficient and insufficient data.
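
The abstract does not spell out how epistemic uncertainty is estimated or how the two models are fused, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes each model yields several sampled class-probability vectors (for example, from repeated stochastic forward passes), measures epistemic uncertainty as the mutual information between prediction and model sample, and weights each model by the other model's share of the total uncertainty. All function names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mean_distribution(samples):
    """Element-wise average of a list of probability vectors."""
    n = len(samples)
    return [sum(s[k] for s in samples) / n for k in range(len(samples[0]))]


def epistemic_uncertainty(samples):
    """Assumed estimator: H[mean prediction] minus the mean of H[each prediction]."""
    total = entropy(mean_distribution(samples))
    expected = sum(entropy(s) for s in samples) / len(samples)
    return max(total - expected, 0.0)


def fuse(majority_samples, minority_samples, eps=1e-12):
    """Assumed fusion rule: the less uncertain model gets the larger weight."""
    u_maj = epistemic_uncertainty(majority_samples)
    u_min = epistemic_uncertainty(minority_samples)
    # Weight of the majority model grows with the minority model's uncertainty.
    w_maj = (u_min + eps) / (u_maj + u_min + 2.0 * eps)
    p_maj = mean_distribution(majority_samples)
    p_min = mean_distribution(minority_samples)
    return [w_maj * a + (1.0 - w_maj) * b for a, b in zip(p_maj, p_min)]


if __name__ == "__main__":
    # Illustrative inputs only: the majority samples agree, the minority samples disagree.
    majority = [[0.9, 0.1], [0.9, 0.1]]
    minority = [[0.9, 0.1], [0.1, 0.9]]
    print(fuse(majority, minority))
```

In this toy input the majority model's samples agree, so its estimated epistemic uncertainty is zero and the fused prediction follows it almost entirely. When both uncertainties are zero, the `eps` term makes the weights fall back to an even split.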

Cite

Text

Guo et al. "Uncertainty-Guided Probabilistic Transformer for Complex Action Recognition." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01942

Markdown

[Guo et al. "Uncertainty-Guided Probabilistic Transformer for Complex Action Recognition." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/guo2022cvpr-uncertaintyguided/) doi:10.1109/CVPR52688.2022.01942

BibTeX

@inproceedings{guo2022cvpr-uncertaintyguided,
  title     = {{Uncertainty-Guided Probabilistic Transformer for Complex Action Recognition}},
  author    = {Guo, Hongji and Wang, Hanjing and Ji, Qiang},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {20052--20061},
  doi       = {10.1109/CVPR52688.2022.01942},
  url       = {https://mlanthology.org/cvpr/2022/guo2022cvpr-uncertaintyguided/}
}