UnweaveNet: Unweaving Activity Stories

Abstract

Our lives can be seen as a complex weaving of activities: we switch from one activity to another, either to maximise our achievements or in reaction to demands placed upon us. Given a video of unscripted daily activities, we parse it into its constituent activity threads through a process we call unweaving. To accomplish this, we introduce a video representation that explicitly captures activity threads, called a thread bank, along with a neural controller capable of detecting goal changes and continuations of past activities; together these form UnweaveNet. We train and evaluate UnweaveNet on sequences from the unscripted egocentric dataset EPIC-KITCHENS, and we propose and showcase the efficacy of pretraining UnweaveNet in a self-supervised manner.

Cite

Text

Price et al. "UnweaveNet: Unweaving Activity Stories." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01340

Markdown

[Price et al. "UnweaveNet: Unweaving Activity Stories." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/price2022cvpr-unweavenet/) doi:10.1109/CVPR52688.2022.01340

BibTeX

@inproceedings{price2022cvpr-unweavenet,
  title     = {{UnweaveNet: Unweaving Activity Stories}},
  author    = {Price, Will and Vondrick, Carl and Damen, Dima},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {13770--13779},
  doi       = {10.1109/CVPR52688.2022.01340},
  url       = {https://mlanthology.org/cvpr/2022/price2022cvpr-unweavenet/}
}