Towards Unsupervised Eye-Region Segmentation for Eye Tracking
Abstract
Locating the eye and parsing out its parts (e.g., pupil and iris) is a key prerequisite for image-based eye tracking, which has become an indispensable module in today's head-mounted VR/AR devices. However, the typical route to training a segmenter requires tedious hand-labeling. In this work, we explore an unsupervised alternative. First, we exploit priors of the human eye and extract signals from the image to establish rough clues indicating the eye-region structure. From these sparse and noisy clues, a segmentation network is trained to gradually identify the precise area of each part. To achieve accurate parsing of the eye region, we first leverage the pretrained foundation model Segment Anything (SAM) in an automatic way to refine the eye indications. Then, the learning process is designed in an end-to-end manner following a progressive, prior-aware principle. Experiments show that our unsupervised approach readily reaches 90% (pupil and iris) and 85% (whole eye region) of the performance achieved under supervised learning.
Cite
Text
Deng et al. "Towards Unsupervised Eye-Region Segmentation for Eye Tracking." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-91989-3_13
Markdown
[Deng et al. "Towards Unsupervised Eye-Region Segmentation for Eye Tracking." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/deng2024eccvw-unsupervised/) doi:10.1007/978-3-031-91989-3_13
BibTeX
@inproceedings{deng2024eccvw-unsupervised,
title = {{Towards Unsupervised Eye-Region Segmentation for Eye Tracking}},
author = {Deng, Jiangfan and Jia, Zhuang and Wang, Zhaoxue and Long, Xiang and Du, Daniel K.},
booktitle = {European Conference on Computer Vision Workshops},
year = {2024},
pages = {199--213},
doi = {10.1007/978-3-031-91989-3_13},
url = {https://mlanthology.org/eccvw/2024/deng2024eccvw-unsupervised/}
}