Observing Pianist Accuracy and Form with Computer Vision
Abstract
We present a first step toward an interactive piano tutoring system that observes a student playing the piano and gives feedback on hand movements and musical accuracy. We have two primary aims: 1) to determine which notes on the piano are being played at any moment in time, and 2) to identify which finger is pressing each note. We introduce a novel two-stream convolutional neural network that takes video and audio inputs together to detect pressed notes and finger presses. We formulate the two problems as multi-task learning and extend a state-of-the-art object detection model to incorporate both audio and visual features. In addition, we introduce a novel finger identification solution based on pressed piano note information. We experimentally confirm that our approach detects pressed piano keys and the piano player's fingers with high accuracy.
Cite
Text
Lee et al. "Observing Pianist Accuracy and Form with Computer Vision." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019. doi:10.1109/WACV.2019.00165
Markdown
[Lee et al. "Observing Pianist Accuracy and Form with Computer Vision." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019.](https://mlanthology.org/wacv/2019/lee2019wacv-observing/) doi:10.1109/WACV.2019.00165
BibTeX
@inproceedings{lee2019wacv-observing,
title = {{Observing Pianist Accuracy and Form with Computer Vision}},
author = {Lee, Jangwon and Doosti, Bardia and Gu, Yupeng and Cartledge, David and Crandall, David J. and Raphael, Christopher},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
year = {2019},
pages = {1505--1513},
doi = {10.1109/WACV.2019.00165},
url = {https://mlanthology.org/wacv/2019/lee2019wacv-observing/}
}