SentiCap: Generating Image Descriptions with Sentiments

Abstract

Recent progress in image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from current systems. One such style is description with emotions, which is commonplace in everyday communication and influences decision-making and interpersonal relationships. We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments. We propose a novel switching recurrent neural network with word-level regularization, which is able to produce emotional image captions using only 2000+ training sentences containing sentiments. We evaluate the captions with different automatic and crowd-sourcing metrics. Our model compares favourably in common quality metrics for image captioning. In 84.6% of cases, the generated positive captions were judged as being at least as descriptive as the factual captions. Of these positive captions, 88% were confirmed by the crowd-sourced workers as having the appropriate sentiment.

Cite

Text

Mathews et al. "SentiCap: Generating Image Descriptions with Sentiments." AAAI Conference on Artificial Intelligence, 2016. doi:10.1609/AAAI.V30I1.10475

Markdown

[Mathews et al. "SentiCap: Generating Image Descriptions with Sentiments." AAAI Conference on Artificial Intelligence, 2016.](https://mlanthology.org/aaai/2016/mathews2016aaai-senticap/) doi:10.1609/AAAI.V30I1.10475

BibTeX

@inproceedings{mathews2016aaai-senticap,
  title     = {{SentiCap: Generating Image Descriptions with Sentiments}},
  author    = {Mathews, Alexander Patrick and Xie, Lexing and He, Xuming},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2016},
  pages     = {3574--3580},
  doi       = {10.1609/AAAI.V30I1.10475},
  url       = {https://mlanthology.org/aaai/2016/mathews2016aaai-senticap/}
}