✉ AirLetters 💨: An Open Video 🎞 Dataset of Characters Drawn in the Air

Abstract

We introduce AirLetters, a new video dataset consisting of real-world videos of human-generated, articulated motions. Specifically, our dataset requires a vision model to predict letters that humans draw in the air. Unlike existing video datasets, accurate classification predictions for AirLetters rely critically on discerning motion patterns and on integrating long-range information in the video over time. An extensive evaluation of state-of-the-art image and video understanding models on AirLetters shows that these methods perform poorly and fall far behind a human baseline. Our work shows that, despite recent progress in end-to-end video understanding, accurate representation of complex articulated motions – a task that is trivial for humans – remains an open problem for end-to-end learning.

Cite

Text

Dagli et al. "✉ AirLetters 💨: An Open Video 🎞 Dataset of Characters Drawn in the Air." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-91578-9_1

Markdown

[Dagli et al. "✉ AirLetters 💨: An Open Video 🎞 Dataset of Characters Drawn in the Air." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/dagli2024eccvw-airletters/) doi:10.1007/978-3-031-91578-9_1

BibTeX

@inproceedings{dagli2024eccvw-airletters,
  title     = {{✉ AirLetters 💨: An Open Video 🎞 Dataset of Characters Drawn in the Air}},
  author    = {Dagli, Rishit and Berger, Guillaume and Materzynska, Joanna and Bax, Ingo and Memisevic, Roland},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2024},
  pages     = {1--32},
  doi       = {10.1007/978-3-031-91578-9_1},
  url       = {https://mlanthology.org/eccvw/2024/dagli2024eccvw-airletters/}
}