DeepDiary: Automatically Captioning Lifelogging Image Streams

Abstract

Lifelogging cameras capture everyday life from a first-person perspective, but generate so much data that it is hard for users to browse and organize their image collections effectively. In this paper, we propose to use automatic image captioning algorithms to generate textual representations of these collections. We develop and explore novel techniques based on deep learning to generate captions for both individual images and image streams, using temporal consistency constraints to create summaries that are both more compact and less noisy. We evaluate our techniques with quantitative and qualitative results, and apply captioning to an image retrieval application for finding potentially private images. Our results suggest that our automatic captioning algorithms, while imperfect, may work well enough to help users manage lifelogging photo collections.

Cite

Text

Fan and Crandall. "DeepDiary: Automatically Captioning Lifelogging Image Streams." European Conference on Computer Vision Workshops, 2016. doi:10.1007/978-3-319-46604-0_33

Markdown

[Fan and Crandall. "DeepDiary: Automatically Captioning Lifelogging Image Streams." European Conference on Computer Vision Workshops, 2016.](https://mlanthology.org/eccvw/2016/fan2016eccvw-deepdiary/) doi:10.1007/978-3-319-46604-0_33

BibTeX

@inproceedings{fan2016eccvw-deepdiary,
  title     = {{DeepDiary: Automatically Captioning Lifelogging Image Streams}},
  author    = {Fan, Chenyou and Crandall, David J.},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2016},
  pages     = {459--473},
  doi       = {10.1007/978-3-319-46604-0_33},
  url       = {https://mlanthology.org/eccvw/2016/fan2016eccvw-deepdiary/}
}