MixANT: Observation-Dependent Memory Propagation for Stochastic Dense Action Anticipation
Abstract
We present MixANT, a novel architecture for stochastic long-term dense anticipation of human activities. While recent State Space Models (SSMs) such as Mamba have shown promise through input-dependent selectivity on three key parameters, the critical forget-gate (A matrix) controlling temporal memory remains static. We address this limitation by introducing a mixture-of-experts approach that dynamically selects contextually relevant A matrices based on input features, enhancing representational capacity without sacrificing computational efficiency. Extensive experiments on the 50Salads, Breakfast, and Assembly101 datasets demonstrate that MixANT consistently outperforms state-of-the-art methods across all evaluation settings. Our results highlight the importance of input-dependent forget-gate mechanisms for reliable prediction of human behavior in diverse real-world scenarios. The project page is available at https://talalwasim.github.io/MixANT/.
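The idea of choosing the A matrix per time step from a small bank of experts can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the soft (softmax-weighted) mixing, the diagonal state parameterization, the simplified discretization, and every name below (`mixture_a_scan`, `A_bank`, `W_router`, `W_delta`, `W_B`, `W_C`) are assumptions made for illustration only. The abstract states only that contextually relevant A matrices are selected from input features.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def mixture_a_scan(x, A_bank, W_router, W_delta, W_B, W_C):
    """Selective-SSM recurrence with an input-dependent A (illustrative only).

    x        : (T, D) input features
    A_bank   : (K, D, N) bank of K diagonal state matrices, entries < 0
    W_router : (D, K) hypothetical router producing expert logits
    W_delta  : (D, D), W_B : (D, N), W_C : (D, N) input-dependent projections
    Returns y of shape (T, D) and the gate weights of shape (T, K).
    """
    T, D = x.shape
    N = A_bank.shape[2]

    # Assumption: soft mixing of experts; a hard top-1 choice is equally plausible.
    gates = softmax(x @ W_router)                      # (T, K)
    A_t = np.einsum("tk,kdn->tdn", gates, A_bank)      # (T, D, N)

    delta = np.log1p(np.exp(x @ W_delta))              # softplus step size, (T, D)
    B = x @ W_B                                        # (T, N)
    C = x @ W_C                                        # (T, N)

    h = np.zeros((D, N))
    y = np.empty((T, D))
    for t in range(T):
        # Per-step forget gate: exp(delta * A) lies in (0, 1) because A < 0.
        decay = np.exp(delta[t][:, None] * A_t[t])     # (D, N)
        h = decay * h + (delta[t][:, None] * B[t][None, :]) * x[t][:, None]
        y[t] = h @ C[t]                                # (D,)
    return y, gates
```

With a single expert (K = 1) the gate is always 1 and the recurrence reduces to a static A, which corresponds to the limitation the abstract describes; with K > 1 the decay applied to the hidden state varies with the observation at each step.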
Cite
Text
Wasim et al. "MixANT: Observation-Dependent Memory Propagation for Stochastic Dense Action Anticipation." International Conference on Computer Vision, 2025.
Markdown
[Wasim et al. "MixANT: Observation-Dependent Memory Propagation for Stochastic Dense Action Anticipation." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/wasim2025iccv-mixant/)
BibTeX
@inproceedings{wasim2025iccv-mixant,
title = {{MixANT: Observation-Dependent Memory Propagation for Stochastic Dense Action Anticipation}},
author = {Wasim, Syed Talal and Suleman, Hamid and Zatsarynna, Olga and Naseer, Muzammal and Gall, Juergen},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {14613-14622},
url = {https://mlanthology.org/iccv/2025/wasim2025iccv-mixant/}
}