ECO: Efficient Convolutional Network for Online Video Understanding

Abstract

The state of the art in video understanding suffers from two problems: (1) The major part of reasoning is performed locally in the video, thus missing important relationships within actions that span several seconds. (2) While there are local methods with fast per-frame processing, the processing of the whole video is not efficient and hampers fast video retrieval or online classification of long-term activities. In this paper, we introduce a network architecture that takes long-term content into account and enables fast per-video processing at the same time. The architecture is based on merging long-term content already in the network rather than in a post-hoc fusion. Together with a sampling strategy, which exploits that neighboring frames are largely redundant, this yields high-quality action classification and video captioning at up to 230 videos per second, where each video can consist of a few hundred frames. The approach achieves competitive performance across all datasets while being 10x to 80x faster than state-of-the-art methods.
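
The abstract's two key ingredients, segment-based frame sampling and merging long-term content inside the network rather than by post-hoc fusion of per-frame scores, can be illustrated with a short sketch. The following is a minimal PyTorch illustration under assumed shapes and layer sizes, not the authors' released implementation: in the paper the 2D part comes from BN-Inception and the 3D part from a 3D-ResNet, while ECOSketch, sample_segments, and all channel counts below are placeholders chosen for the example.

# Minimal sketch of an ECO-style network (illustrative, not the authors' code).
import torch
import torch.nn as nn

def sample_segments(video: torch.Tensor, num_segments: int = 16) -> torch.Tensor:
    """Split the video into equal segments and take one frame from each.

    Exploits the redundancy of neighboring frames. (The paper samples a
    random frame per segment during training; evenly spaced frames are
    used here for simplicity.) video: (num_frames, 3, H, W).
    """
    idx = torch.linspace(0, video.shape[0] - 1, num_segments).long()
    return video[idx]

class ECOSketch(nn.Module):
    def __init__(self, num_segments: int = 16, num_classes: int = 400,
                 feat_channels: int = 96):
        super().__init__()
        self.num_segments = num_segments
        # 2D convolutions applied to each sampled frame independently
        # (stand-in for the lower layers of a 2D backbone).
        self.backbone2d = nn.Sequential(
            nn.Conv2d(3, feat_channels, kernel_size=7, stride=4, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_channels, feat_channels, kernel_size=3,
                      stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # 3D convolutions over the stacked per-frame feature maps: long-term
        # temporal relationships are learned inside the network instead of
        # being recovered by averaging per-frame predictions afterwards.
        self.head3d = nn.Sequential(
            nn.Conv3d(feat_channels, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, num_segments, 3, H, W), one frame per segment.
        b, t, c, h, w = frames.shape
        x = self.backbone2d(frames.reshape(b * t, c, h, w))
        # Restore the time axis and stack into a volume (B, C, T, H', W').
        x = x.reshape(b, t, *x.shape[1:]).permute(0, 2, 1, 3, 4)
        x = self.head3d(x).flatten(1)
        return self.classifier(x)

# Usage: a 300-frame clip is reduced to 16 frames before any convolution
# runs, which is where the per-video speedup comes from.
video = torch.randn(300, 3, 224, 224)
frames = sample_segments(video).unsqueeze(0)  # (1, 16, 3, 224, 224)
logits = ECOSketch()(frames)                  # (1, 400)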

Cite

Text

Zolfaghari et al. "ECO: Efficient Convolutional Network for Online Video Understanding." Proceedings of the European Conference on Computer Vision (ECCV), 2018. doi:10.1007/978-3-030-01216-8_43

Markdown

[Zolfaghari et al. "ECO: Efficient Convolutional Network for Online Video Understanding." Proceedings of the European Conference on Computer Vision (ECCV), 2018.](https://mlanthology.org/eccv/2018/zolfaghari2018eccv-eco/) doi:10.1007/978-3-030-01216-8_43

BibTeX

@inproceedings{zolfaghari2018eccv-eco,
  title     = {{ECO: Efficient Convolutional Network for Online Video Understanding}},
  author    = {Zolfaghari, Mohammadreza and Singh, Kamaljeet and Brox, Thomas},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2018},
  doi       = {10.1007/978-3-030-01216-8_43},
  url       = {https://mlanthology.org/eccv/2018/zolfaghari2018eccv-eco/}
}