ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images

Abstract

Scanpath prediction for 360° images aims to produce dynamic gaze behaviors based on the human visual perception mechanism. Most existing scanpath prediction methods for 360° images do not fully account for time-dependency when predicting human scanpaths, resulting in inferior performance and poor generalizability. In this paper, we present a scanpath prediction method for 360° images by designing a novel Deep Markov Model (DMM) architecture, namely ScanDMM. We propose a semantics-guided transition function to learn the nonlinear dynamics of the time-dependent attentional landscape. Moreover, we propose a state initialization strategy that considers the starting point of viewing, enabling the model to learn the dynamics with the correct "launcher". We further demonstrate that our model achieves state-of-the-art performance on four 360° image databases, and show its generalizability by presenting two applications of scanpath prediction models to other visual tasks, saliency detection and image quality assessment, which we expect to offer profound insights into these fields.

Cite

Text

Sui et al. "ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images." Conference on Computer Vision and Pattern Recognition, 2023.

Markdown

[Sui et al. "ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/sui2023cvpr-scandmm/)

BibTeX

@inproceedings{sui2023cvpr-scandmm,
  title     = {{ScanDMM: A Deep Markov Model of Scanpath Prediction for 360$^\circ$ Images}},
  author    = {Sui, Xiangjie and Fang, Yuming and Zhu, Hanwei and Wang, Shiqi and Wang, Zhou},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {6989--6999},
  url       = {https://mlanthology.org/cvpr/2023/sui2023cvpr-scandmm/}
}