Modeling Image Composition for Visual Aesthetic Assessment

Abstract

Composition is an important cue for characterizing the aesthetic quality of an image. We propose to model image composition as the mutual dependencies among an image's local regions, and design an architecture that leverages this information to boost aesthetics assessment. We adopt a Fully Convolutional Network (FCN) as the feature encoder of the input image and use the encoded feature map to represent the individual local regions and their spatial layout. We then build a region composition graph in which each node denotes one region and any two nodes are connected by an edge weighted by the similarity of their region features. We perform reasoning on this graph via graph convolution, in which the activation of each node is determined by its most highly correlated neighbors. Our method achieves state-of-the-art performance on the benchmark visual aesthetics dataset [15].
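
The pipeline described above (FCN encoder → region composition graph → graph convolution → aesthetic prediction) can be illustrated with a minimal PyTorch-style sketch. The backbone choice (ResNet-50), dot-product similarity, single graph-convolution layer, and binary high/low aesthetic label below are illustrative assumptions, not the authors' exact implementation.

# Minimal sketch of a region-composition-graph model (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class RegionCompositionGraphNet(nn.Module):
    def __init__(self, num_classes=2, embed_dim=256):
        super().__init__()
        # Fully convolutional encoder: ResNet-50 without its pooling/FC head.
        backbone = models.resnet50(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        self.project = nn.Conv2d(2048, embed_dim, kernel_size=1)
        # Graph-convolution weights, applied after neighborhood aggregation.
        self.graph_fc = nn.Linear(embed_dim, embed_dim)
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        # 1) Encode the image into a spatial feature map; each location is a region node.
        fmap = self.project(self.encoder(x))           # (B, D, H, W)
        B, D, H, W = fmap.shape
        nodes = fmap.flatten(2).transpose(1, 2)        # (B, N, D), N = H*W region nodes

        # 2) Build the region composition graph: edge weights from pairwise similarity.
        sim = torch.bmm(nodes, nodes.transpose(1, 2))  # (B, N, N) dot-product similarity
        adj = F.softmax(sim, dim=-1)                   # row-normalized adjacency

        # 3) Graph convolution: each node aggregates its highly correlated neighbors.
        aggregated = torch.bmm(adj, nodes)             # (B, N, D)
        nodes = F.relu(self.graph_fc(aggregated)) + nodes  # residual node update

        # 4) Pool node features and predict the aesthetic label.
        graph_feat = nodes.mean(dim=1)                 # (B, D)
        return self.classifier(graph_feat)


if __name__ == "__main__":
    model = RegionCompositionGraphNet()
    scores = model(torch.randn(2, 3, 224, 224))
    print(scores.shape)  # torch.Size([2, 2])

In this sketch the softmax-normalized similarity matrix plays the role of the edge weights, so each region's updated feature is dominated by the regions it is most correlated with, which mirrors the reasoning step described in the abstract.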

Cite

Text

Liu et al. "Modeling Image Composition for Visual Aesthetic Assessment." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019. doi:10.1109/CVPRW.2019.00043

Markdown

[Liu et al. "Modeling Image Composition for Visual Aesthetic Assessment." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.](https://mlanthology.org/cvprw/2019/liu2019cvprw-modeling/) doi:10.1109/CVPRW.2019.00043

BibTeX

@inproceedings{liu2019cvprw-modeling,
  title     = {{Modeling Image Composition for Visual Aesthetic Assessment}},
  author    = {Liu, Dong and Puri, Rohit and Kamath, Nagendra and Bhattacharya, Subhabrata},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2019},
  pages     = {320--322},
  doi       = {10.1109/CVPRW.2019.00043},
  url       = {https://mlanthology.org/cvprw/2019/liu2019cvprw-modeling/}
}