Fast Forwarding Egocentric Videos by Listening and Watching

Abstract

The remarkable technological advances in well-equipped wearable devices are driving an increasing production of long first-person videos. However, since most of these videos contain long and tedious segments, they are forgotten or never watched. Although a large number of techniques have been proposed to fast-forward such videos by highlighting relevant moments, most of them are image-based only, disregarding other relevant sensors present in current devices, such as high-definition microphones. In this work, we propose a new approach to fast-forward videos using psychoacoustic metrics extracted from the soundtrack. These metrics estimate the annoyance of a segment, allowing our method to emphasize moments of sound pleasantness. The effectiveness of our method is demonstrated through qualitative results and quantitative results regarding speed-up and instability.
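To make the idea concrete, the sketch below illustrates one possible reading of the pipeline: score each audio segment with an annoyance measure, then map those scores to per-segment playback speeds so that pleasant segments play slowly and annoying ones are skipped quickly. This is a minimal sketch, not the authors' implementation: it assumes fixed-length segments and uses RMS energy as a crude stand-in for proper psychoacoustic metrics (e.g., Zwicker loudness); all function names and parameters are hypothetical.

```python
import numpy as np

def segment_annoyance(audio, sr, segment_s=1.0):
    """Score each fixed-length segment by a crude loudness proxy.

    Hypothetical stand-in for the paper's psychoacoustic metrics:
    RMS energy per segment, normalized to [0, 1], where 0 is the most
    pleasant segment and 1 the most annoying.
    """
    seg_len = int(segment_s * sr)
    n_segs = len(audio) // seg_len
    scores = np.empty(n_segs)
    for i in range(n_segs):
        seg = audio[i * seg_len:(i + 1) * seg_len]
        scores[i] = np.sqrt(np.mean(seg ** 2))  # RMS as annoyance proxy
    scores -= scores.min()
    if scores.max() > 0:
        scores /= scores.max()
    return scores

def speedup_per_segment(scores, min_speed=1, max_speed=10):
    """Map annoyance scores to integer per-segment speed-up rates:
    low-score (pleasant) segments get min_speed, high-score ones
    approach max_speed."""
    return np.round(min_speed + scores * (max_speed - min_speed)).astype(int)

if __name__ == "__main__":
    sr = 16000
    rng = np.random.default_rng(0)
    # Synthetic soundtrack: a quiet first half and a noisy second half.
    audio = np.concatenate([
        0.05 * rng.standard_normal(sr * 5),
        0.8 * rng.standard_normal(sr * 5),
    ])
    scores = segment_annoyance(audio, sr)
    print("annoyance:", np.round(scores, 2))
    print("speed-up: ", speedup_per_segment(scores))
```

Running the example, the quiet half receives low annoyance scores (and hence a low speed-up), while the noisy half is assigned speeds near the maximum, mirroring the emphasis on pleasant-sounding moments described in the abstract.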

Cite

Text

Furlan et al. "Fast Forwarding Egocentric Videos by Listening and Watching." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2018.

Markdown

[Furlan et al. "Fast Forwarding Egocentric Videos by Listening and Watching." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2018.](https://mlanthology.org/cvprw/2018/furlan2018cvprw-fast/)

BibTeX

@inproceedings{furlan2018cvprw-fast,
  title     = {{Fast Forwarding Egocentric Videos by Listening and Watching}},
  author    = {Furlan, Vinicius Signori and Bajcsy, Ruzena and Nascimento, Erickson R.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2018},
  pages     = {2504--2507},
  url       = {https://mlanthology.org/cvprw/2018/furlan2018cvprw-fast/}
}