HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization

Abstract

Online video understanding often relies on individual frames, leading to frame-by-frame predictions. Recent advancements, such as Online Temporal Action Localization (OnTAL), extend this approach to instance-level predictions. However, existing methods mainly focus on short-term context and neglect historical information. To address this, we introduce the History-Augmented Anchor Transformer (HAT) framework for OnTAL. By integrating historical context, our framework enhances the synergy between long-term and short-term information, improving the quality of the anchor features that are crucial for classification and localization. We evaluate our model on both procedural egocentric (PREGO) datasets (EGTEA and EPIC) and standard non-PREGO OnTAL datasets (THUMOS and MUSES). Results show that our model significantly outperforms state-of-the-art approaches on PREGO datasets and achieves comparable or slightly superior performance on non-PREGO datasets, underscoring the importance of leveraging long-term history, especially in procedural and egocentric action scenarios. Code is available at: https://github.com/sakibreza/ECCV24-HAT/.

Cite

Text

Reza et al. "HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72664-4_12

Markdown

[Reza et al. "HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/reza2024eccv-hat/) doi:10.1007/978-3-031-72664-4_12

BibTeX

@inproceedings{reza2024eccv-hat,
  title     = {{HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization}},
  author    = {Reza, Sakib and Zhang, Yuexi and Moghaddam, Mohsen and Camps, Octavia},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72664-4_12},
  url       = {https://mlanthology.org/eccv/2024/reza2024eccv-hat/}
}