MIDI-Assisted Egocentric Optical Music Recognition

Abstract

Egocentric vision has received increasing attention in recent years due to the rapid development of wearable devices and their applications. Although there are numerous existing works on egocentric vision, none of them addresses the Optical Music Recognition (OMR) problem. In this paper, we propose a novel optical music recognition approach for egocentric devices (e.g., Google Glass) with the assistance of MIDI data. We formulate the task as a structured sequence alignment problem, as opposed to the blind recognition performed by traditional OMR systems. We propose a linear-chain Conditional Random Field (CRF) to model the note-event sequence, which translates the relative temporal relations contained in the MIDI data into spatial constraints over the egocentric observation. Evaluations against several baselines show that our approach achieves the highest recognition accuracy. We view this work as a first step toward egocentric optical music recognition and believe it will offer insights for next-generation music pedagogy and music entertainment.
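To give a flavor of the alignment idea described above, the sketch below shows Viterbi decoding for a linear-chain model in which each MIDI note event receives a unary score over candidate spatial positions in the image, and the pairwise term enforces the left-to-right reading order implied by MIDI timing. This is an illustrative toy, not the paper's implementation: the unary scores, candidate positions, and the hard ordering constraint are all assumptions made for the example.

```python
import numpy as np

def viterbi_align(unary):
    """Viterbi decoding for a toy linear-chain model over note events.

    unary[t, s]: hypothetical score of placing MIDI note event t at
    candidate spatial position s in the egocentric observation.
    Transitions only allow non-decreasing positions, encoding the
    temporal order of the MIDI events as a spatial constraint.
    """
    T, S = unary.shape
    dp = np.full((T, S), -np.inf)   # best score ending at (t, s)
    back = np.zeros((T, S), dtype=int)
    dp[0] = unary[0]
    for t in range(1, T):
        for s in range(S):
            # only transitions from positions <= s keep spatial order
            prev = dp[t - 1, : s + 1]
            j = int(np.argmax(prev))
            dp[t, s] = prev[j] + unary[t, s]
            back[t, s] = j
    # backtrack the best monotone alignment
    path = [int(np.argmax(dp[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return path[::-1]

# toy example: 3 note events, 4 candidate positions
unary = np.array([[2.0, 0.1, 0.1, 0.1],
                  [0.1, 0.1, 3.0, 0.1],
                  [0.1, 0.1, 0.1, 1.5]])
print(viterbi_align(unary))  # → [0, 2, 3]
```

The returned position sequence is guaranteed non-decreasing, which is the sense in which temporal relations in the MIDI data become spatial constraints on the recognition.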

Cite

Text

Chen and Duan. "MIDI-Assisted Egocentric Optical Music Recognition." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016. doi:10.1109/WACV.2016.7477714

Markdown

[Chen and Duan. "MIDI-Assisted Egocentric Optical Music Recognition." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016.](https://mlanthology.org/wacv/2016/chen2016wacv-midi/) doi:10.1109/WACV.2016.7477714

BibTeX

@inproceedings{chen2016wacv-midi,
  title     = {{MIDI-Assisted Egocentric Optical Music Recognition}},
  author    = {Chen, Liang and Duan, Kun},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2016},
  pages     = {1--9},
  doi       = {10.1109/WACV.2016.7477714},
  url       = {https://mlanthology.org/wacv/2016/chen2016wacv-midi/}
}