Differentiable Grammars for Videos

Abstract

This paper proposes a novel algorithm that learns a formal regular grammar from real-world continuous data, such as videos. Learning latent terminals, non-terminals, and production rules directly from continuous data allows the construction of a generative model that captures sequential structures with multiple possibilities. Our model is fully differentiable and yields easily interpretable results, which are important for understanding the learned structures. It outperforms the state of the art on several challenging datasets and forecasts future activities in videos more accurately. We plan to open-source the code.
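As a rough illustration of what a differentiable regular grammar could look like, the sketch below relaxes rules of the form `A -> t B` (emit terminal `t`, move to non-terminal `B`) into soft distributions. It is not the authors' implementation: the abstract does not specify the parameterization, so the Gumbel-softmax rule selection, the tensor shapes, and the names `init_params` and `generate` are all assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def gumbel_softmax(logits, tau, rng):
    """Relaxed (soft) one-hot sample over the last axis; assumed relaxation."""
    u = rng.uniform(low=1e-9, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))
    return softmax((logits + gumbel_noise) / tau, axis=-1)


def init_params(n_nonterminals, n_rules, n_terminals, seed=0):
    """Random logits for a hypothetical grammar with rules A -> t B."""
    rng = np.random.default_rng(seed)
    rule_logits = rng.normal(size=(n_nonterminals, n_rules))
    terminal_logits = rng.normal(size=(n_nonterminals, n_rules, n_terminals))
    next_logits = rng.normal(size=(n_nonterminals, n_rules, n_nonterminals))
    return rule_logits, terminal_logits, next_logits


def generate(params, steps, tau=1.0, seed=0):
    """Unroll the soft grammar and return one terminal distribution per step."""
    rule_logits, terminal_logits, next_logits = params
    rng = np.random.default_rng(seed)
    terminal_probs = softmax(terminal_logits)  # (N, R, T)
    next_probs = softmax(next_logits)          # (N, R, N)

    # Start in non-terminal 0 with probability 1.
    state = np.zeros(rule_logits.shape[0])
    state[0] = 1.0

    outputs = []
    for _ in range(steps):
        # Soft choice of one production rule per non-terminal: (N, R).
        rule_choice = gumbel_softmax(rule_logits, tau, rng)
        # Weight each rule by how likely we are to be in its non-terminal.
        weights = state[:, None] * rule_choice
        # Expected terminal emitted at this step: (T,).
        outputs.append(np.einsum("nr,nrt->t", weights, terminal_probs))
        # Expected next non-terminal: (N,).
        state = np.einsum("nr,nrm->m", weights, next_probs)
    return np.stack(outputs)


params = init_params(n_nonterminals=3, n_rules=2, n_terminals=4, seed=0)
sequence = generate(params, steps=5, tau=0.5, seed=1)
print(sequence.shape)
```

Each row of `sequence` is a probability distribution over terminals for one time step. Every operation here is smooth in the logits, so the same computation written in an automatic-differentiation framework could be trained by gradient descent; different random seeds select different rules, which is one way to model multiple possible continuations.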

Cite

Text

Piergiovanni et al. "Differentiable Grammars for Videos." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/AAAI.V34I07.6861

Markdown

[Piergiovanni et al. "Differentiable Grammars for Videos." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/piergiovanni2020aaai-differentiable/) doi:10.1609/AAAI.V34I07.6861

BibTeX

@inproceedings{piergiovanni2020aaai-differentiable,
  title     = {{Differentiable Grammars for Videos}},
  author    = {Piergiovanni, A. J. and Angelova, Anelia and Ryoo, Michael S.},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2020},
  pages     = {11874--11881},
  doi       = {10.1609/AAAI.V34I07.6861},
  url       = {https://mlanthology.org/aaai/2020/piergiovanni2020aaai-differentiable/}
}