Efficiency in Real-Time Webcam Gaze Tracking

Abstract

Efficiency and ease of use are essential for practical applications of camera-based eye/gaze tracking. Gaze tracking involves estimating where a person is looking on a screen based on face images from a computer-facing camera. In this paper we investigate two complementary forms of efficiency in gaze tracking: (1) the computational efficiency of the system, which is dominated by the inference speed of a CNN predicting gaze vectors; and (2) the usability efficiency, which is determined by the effort required for the mandatory calibration that maps gaze vectors to a computer screen. To do so, we evaluate the computational speed/accuracy trade-off for the CNN and the calibration effort/accuracy trade-off for screen calibration. For the CNN, we evaluate full-face, two-eye, and single-eye inputs. For screen calibration, we measure the number of calibration points needed and evaluate three types of calibration: (1) pure geometry, (2) pure machine learning, and (3) hybrid geometric regression. Results suggest that a single-eye input and geometric-regression calibration achieve the best trade-off.
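
To make the calibration idea concrete, the sketch below shows a generic, regression-style screen calibration in the spirit of the "pure machine learning" variant described in the abstract: gaze vectors predicted at a handful of known on-screen targets are used to fit a simple affine map from gaze vector to screen coordinate. This is not the paper's implementation; the function names, the input format (3D unit gaze vectors, pixel screen targets), and the plain least-squares fit are illustrative assumptions.

```python
import numpy as np

def fit_screen_calibration(gaze_vectors, screen_points):
    """Fit an affine map from predicted gaze vectors to 2D screen coordinates.

    gaze_vectors:  (N, 3) gaze vectors predicted by the CNN while the user
                   fixates the calibration targets (assumed format).
    screen_points: (N, 2) ground-truth on-screen targets in pixels.
    Returns a (4, 2) weight matrix W such that [g, 1] @ W ~ screen point.
    """
    g = np.asarray(gaze_vectors, dtype=float)
    s = np.asarray(screen_points, dtype=float)
    # Append a bias column so the fit can absorb a constant screen offset.
    X = np.hstack([g, np.ones((g.shape[0], 1))])
    # Plain least squares; with few calibration points this stays well-posed
    # only if the targets are spread across the screen.
    W, *_ = np.linalg.lstsq(X, s, rcond=None)
    return W

def gaze_to_screen(W, gaze_vector):
    """Map a single predicted gaze vector to an estimated screen coordinate."""
    x = np.append(np.asarray(gaze_vector, dtype=float), 1.0)
    return x @ W

if __name__ == "__main__":
    # Hypothetical 5-point calibration: four screen corners plus the centre.
    rng = np.random.default_rng(0)
    targets = np.array([[0, 0], [1920, 0], [0, 1080], [1920, 1080], [960, 540]], float)
    gazes = targets / [1920, 1080] - 0.5              # synthetic gaze x/y components
    gazes = np.hstack([gazes, -np.ones((5, 1))])      # looking "into" the screen
    gazes += rng.normal(scale=0.01, size=gazes.shape) # simulated CNN prediction noise
    W = fit_screen_calibration(gazes, targets)
    print(gaze_to_screen(W, [0.0, 0.0, -1.0]))        # roughly the screen centre
```

The number of rows in `targets` corresponds to the "number of calibration points" axis studied in the paper: fewer points mean less user effort but a less constrained fit, which is exactly the effort/accuracy trade-off being measured.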

Cite

Text

Gudi et al. "Efficiency in Real-Time Webcam Gaze Tracking." European Conference on Computer Vision Workshops, 2020. doi:10.1007/978-3-030-66415-2_34

Markdown

[Gudi et al. "Efficiency in Real-Time Webcam Gaze Tracking." European Conference on Computer Vision Workshops, 2020.](https://mlanthology.org/eccvw/2020/gudi2020eccvw-efficiency/) doi:10.1007/978-3-030-66415-2_34

BibTeX

@inproceedings{gudi2020eccvw-efficiency,
  title     = {{Efficiency in Real-Time Webcam Gaze Tracking}},
  author    = {Gudi, Amogh and Li, Xin and van Gemert, Jan},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2020},
  pages     = {529-543},
  doi       = {10.1007/978-3-030-66415-2_34},
  url       = {https://mlanthology.org/eccvw/2020/gudi2020eccvw-efficiency/}
}