Searching Efficient Neural Architecture with Multi-Resolution Fusion Transformer for Appearance-Based Gaze Estimation

Abstract

To improve the accuracy of appearance-based gaze estimation, a series of recent works use transformers or high-resolution networks in various ways and achieve state-of-the-art results, but these models are not efficient enough for real-time applications on edge computing devices. In this paper, we propose a compact model that solves gaze estimation both precisely and efficiently. The proposed model includes 1) a Neural Architecture Search (NAS)-based multi-resolution feature extractor that extracts feature maps carrying the global and local information essential for this task, and 2) a novel multi-resolution fusion transformer that serves as the gaze estimation head and efficiently estimates gaze values by fusing the extracted feature maps. We search our proposed model, called GazeNAS-ETH, on the ETH-XGaze dataset. Experiments confirm that GazeNAS-ETH achieves state-of-the-art performance on the Gaze360, MPIIFaceGaze, RTGENE, and EYEDIAP datasets while having only about 1M parameters and using only 0.28 GFLOPs, significantly fewer than previous state-of-the-art models, which makes it easier to deploy for real-time applications.

Cite

Text

Nagpure and Okuma. "Searching Efficient Neural Architecture with Multi-Resolution Fusion Transformer for Appearance-Based Gaze Estimation." Winter Conference on Applications of Computer Vision, 2023.

Markdown

[Nagpure and Okuma. "Searching Efficient Neural Architecture with Multi-Resolution Fusion Transformer for Appearance-Based Gaze Estimation." Winter Conference on Applications of Computer Vision, 2023.](https://mlanthology.org/wacv/2023/nagpure2023wacv-searching/)

BibTeX

@inproceedings{nagpure2023wacv-searching,
  title     = {{Searching Efficient Neural Architecture with Multi-Resolution Fusion Transformer for Appearance-Based Gaze Estimation}},
  author    = {Nagpure, Vikrant and Okuma, Kenji},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2023},
  pages     = {890-899},
  url       = {https://mlanthology.org/wacv/2023/nagpure2023wacv-searching/}
}