Uncertainty-Based Spatial-Temporal Attention for Online Action Detection
Abstract
Online action detection aims to detect the ongoing action in a streaming video. In this paper, we propose an uncertainty-based spatial-temporal attention for online action detection. By explicitly modeling the distribution of model parameters, we extend the baseline models in a probabilistic manner. We then quantify the predictive uncertainty and use it to generate spatial-temporal attention that focuses on regions and frames with large mutual information. For inference, we introduce a two-stream framework that combines the baseline model and the probabilistic model based on the input uncertainty. We validate the effectiveness of our method on three benchmark datasets: THUMOS-14, TVSeries, and HDD. Furthermore, we demonstrate that our method generalizes better under different views and occlusions, and is more robust when trained on small-scale data.
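As a rough illustration of the uncertainty quantity mentioned in the abstract, the sketch below computes the mutual information between the prediction and the model parameters from a set of sampled class distributions, using the standard decomposition (entropy of the mean prediction minus the mean entropy of the individual predictions). This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the use of sampled predictions, and the softmax over per-frame mutual information as a temporal attention weight are all hypothetical choices made for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mutual_information(samples):
    """Estimate epistemic uncertainty from sampled class distributions.

    samples: list of S probability vectors (one per parameter sample),
             each of length C.
    Returns H[mean_s p_s] - mean_s H[p_s], which is >= 0.
    """
    s = len(samples)
    c = len(samples[0])
    mean_p = [sum(p[k] for p in samples) / s for k in range(c)]
    total_uncertainty = entropy(mean_p)
    expected_entropy = sum(entropy(p) for p in samples) / s
    return total_uncertainty - expected_entropy


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    z = sum(exps)
    return [e / z for e in exps]


def temporal_attention(per_frame_samples):
    """Hypothetical attention: softmax over per-frame mutual information.

    This weighting scheme is an assumption for illustration only.
    """
    return softmax([mutual_information(s) for s in per_frame_samples])


# Frame 0: the sampled models agree; frame 1: they disagree completely.
agree = [[0.9, 0.1], [0.9, 0.1]]
disagree = [[1.0, 0.0], [0.0, 1.0]]
weights = temporal_attention([agree, disagree])
```

In this toy example the frame on which the sampled models disagree receives the larger weight, which matches the intuition that attention should concentrate where mutual information is high.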
Cite
Text
Guo et al. "Uncertainty-Based Spatial-Temporal Attention for Online Action Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19772-7_5
Markdown
[Guo et al. "Uncertainty-Based Spatial-Temporal Attention for Online Action Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/guo2022eccv-uncertaintybased/) doi:10.1007/978-3-031-19772-7_5
BibTeX
@inproceedings{guo2022eccv-uncertaintybased,
title = {{Uncertainty-Based Spatial-Temporal Attention for Online Action Detection}},
author = {Guo, Hongji and Ren, Zhou and Wu, Yi and Hua, Gang and Ji, Qiang},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2022},
doi = {10.1007/978-3-031-19772-7_5},
url = {https://mlanthology.org/eccv/2022/guo2022eccv-uncertaintybased/}
}