MetaCLUE: Towards Comprehensive Visual Metaphors Research

Abstract

Creativity is an indispensable part of human cognition and also an inherent part of how we make sense of the world. Metaphorical abstraction is fundamental in communicating creative ideas through nuanced relationships between abstract concepts such as feelings. While computer vision benchmarks and approaches predominantly focus on understanding and generating literal interpretations of images, metaphorical comprehension of images remains relatively unexplored. Towards this goal, we introduce MetaCLUE, a set of vision tasks on visual metaphor. We also collect high-quality and rich metaphor annotations (abstract objects, concepts, relationships along with their corresponding object boxes) as there do not exist any datasets that facilitate the evaluation of these tasks. We perform a comprehensive analysis of state-of-the-art models in vision and language based on our annotations, highlighting strengths and weaknesses of current approaches in visual metaphor Classification, Localization, Understanding (retrieval, question answering, captioning) and gEneration (text-to-image synthesis) tasks. We hope this work provides a concrete step towards systematically developing AI systems with human-like creative capabilities. Project page: https://metaclue.github.io

Cite

Text

Akula et al. "MetaCLUE: Towards Comprehensive Visual Metaphors Research." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.02222

Markdown

[Akula et al. "MetaCLUE: Towards Comprehensive Visual Metaphors Research." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/akula2023cvpr-metaclue/) doi:10.1109/CVPR52729.2023.02222

BibTeX

@inproceedings{akula2023cvpr-metaclue,
  title     = {{MetaCLUE: Towards Comprehensive Visual Metaphors Research}},
  author    = {Akula, Arjun R. and Driscoll, Brendan and Narayana, Pradyumna and Changpinyo, Soravit and Jia, Zhiwei and Damle, Suyash and Pruthi, Garima and Basu, Sugato and Guibas, Leonidas and Freeman, William T. and Li, Yuanzhen and Jampani, Varun},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {23201--23211},
  doi       = {10.1109/CVPR52729.2023.02222},
  url       = {https://mlanthology.org/cvpr/2023/akula2023cvpr-metaclue/}
}