Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities

Abstract

Assembly101 is a new procedural activity dataset featuring 4321 videos of people assembling and disassembling 101 "take-apart" toy vehicles. Participants work without fixed instructions, and the sequences feature rich and natural variations in action ordering, mistakes, and corrections. Assembly101 is the first multi-view action dataset, with simultaneous static (8) and egocentric (4) recordings. Sequences are annotated with more than 100K coarse and 1M fine-grained action segments, and 18M 3D hand poses. We benchmark on three action understanding tasks: recognition, anticipation and temporal segmentation. Additionally, we propose a novel task of detecting mistakes. The unique recording format and rich set of annotations allow us to investigate generalization to new toys, cross-view transfer, long-tailed distributions, and pose vs. appearance. We envision that Assembly101 will serve as a new challenge to investigate various activity understanding problems.

Cite

Text

Sener et al. "Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.02042

Markdown

[Sener et al. "Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/sener2022cvpr-assembly101/) doi:10.1109/CVPR52688.2022.02042

BibTeX

@inproceedings{sener2022cvpr-assembly101,
  title     = {{Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities}},
  author    = {Sener, Fadime and Chatterjee, Dibyadip and Shelepov, Daniel and He, Kun and Singhania, Dipika and Wang, Robert and Yao, Angela},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {21096--21106},
  doi       = {10.1109/CVPR52688.2022.02042},
  url       = {https://mlanthology.org/cvpr/2022/sener2022cvpr-assembly101/}
}