Self Supervised Scanpath Prediction Framework for Painting Images

Abstract

In this paper, we propose a novel strategy for learning distortion-invariant latent representations from painting images for the downstream task of visual attention modelling. Specifically, we design an unsupervised framework that jointly maximises the mutual information over different painting styles. To show the effectiveness of our approach, we first propose a lightweight scanpath baseline model and compare its performance with state-of-the-art methods. Second, we train the encoder of our baseline model on large-scale painting images to study the efficiency of the proposed self-supervised strategy. The lightweight decoder proves effective in learning from the self-supervised pre-trained encoder, achieving better performance than the end-to-end fine-tuned supervised baseline on two painting datasets, including a newly proposed visual attention modelling dataset.
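
The abstract does not state which objective is used to maximise mutual information across painting styles. As a minimal sketch, assuming a contrastive InfoNCE-style loss (a common lower-bound estimator of mutual information, not confirmed as the authors' choice), the idea can be illustrated as follows. The function names, the temperature value, and the toy embeddings are all hypothetical; in practice the embeddings would come from an encoder applied to two differently styled versions of the same painting.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v] if norm > 0 else list(v)


def info_nce(view_a, view_b, temperature=0.1):
    """Average InfoNCE loss over a batch of paired embeddings.

    Assumption: row i of view_a and row i of view_b embed the same painting
    under two different styles (the positive pair); every other row of
    view_b acts as a negative for row i of view_a.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    n = len(a)
    total = 0.0
    for i in range(n):
        # Cosine similarities between anchor i and every candidate, scaled.
        logits = [dot(a[i], b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy with the matching index i as the correct class.
        total += log_sum - logits[i]
    return total / n


# Toy embeddings (hypothetical): two paintings, two styles each.
style_1 = [[1.0, 0.0], [0.0, 1.0]]
style_2_aligned = [[0.9, 0.1], [0.1, 0.9]]    # same painting, similar embedding
style_2_shuffled = [[0.1, 0.9], [0.9, 0.1]]   # pairs deliberately mismatched

loss_aligned = info_nce(style_1, style_2_aligned)
loss_shuffled = info_nce(style_1, style_2_shuffled)

# log(N) - loss is a lower bound on the mutual information between views.
mi_bound = math.log(len(style_1)) - loss_aligned
```

Minimising this loss pulls embeddings of the same painting together across styles and pushes different paintings apart, which is one way an encoder can become invariant to style changes. Whether the framework described here uses this exact loss is an assumption of the sketch.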

Cite

Text

Tliba et al. "Self Supervised Scanpath Prediction Framework for Painting Images." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022. doi:10.1109/CVPRW56347.2022.00160

Markdown

[Tliba et al. "Self Supervised Scanpath Prediction Framework for Painting Images." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022.](https://mlanthology.org/cvprw/2022/tliba2022cvprw-self/) doi:10.1109/CVPRW56347.2022.00160

BibTeX

@inproceedings{tliba2022cvprw-self,
  title     = {{Self Supervised Scanpath Prediction Framework for Painting Images}},
  author    = {Tliba, Marouane and Kerkouri, Mohamed Amine and Chetouani, Aladine and Bruno, Alessandro},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2022},
  pages     = {1538-1547},
  doi       = {10.1109/CVPRW56347.2022.00160},
  url       = {https://mlanthology.org/cvprw/2022/tliba2022cvprw-self/}
}