MambaPupil: Bidirectional Selective Recurrent Model for Event-Based Eye Tracking
Abstract
Event-based eye tracking has shown great promise thanks to the high temporal resolution and low redundancy provided by the event camera. However, the diversity and abruptness of eye movement patterns, including blinking, fixating, saccades, and smooth pursuit, pose significant challenges for eye localization. To achieve a stable event-based eye-tracking system, this paper proposes bidirectional long-term sequence modeling and a time-varying state selection mechanism to fully utilize contextual temporal information in response to the variability of eye movements. Specifically, the MambaPupil network is proposed, which consists of a multi-layer convolutional encoder to extract features from the event representations, a bidirectional Gated Recurrent Unit (GRU), and a Linear Time-Varying State Space Module (LTV-SSM) to selectively capture contextual correlation from the forward and backward temporal relationships. Furthermore, Bina-rep is utilized as a compact event representation, and a tailor-made data augmentation, called Event-Cutout, is proposed to enhance the model’s robustness by applying spatial random masking to the event image. Evaluation on the ThreeET-plus benchmark shows that MambaPupil realizes stable and accurate eye tracking under various complex conditions and achieves state-of-the-art performance.
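The abstract describes Event-Cutout only as spatial random masking applied to the event image. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a single zero-filled rectangle per frame, and the function name `event_cutout` and the parameter `max_frac` are hypothetical choices made for illustration.

```python
import numpy as np


def event_cutout(frame, max_frac=0.3, rng=None):
    """Return a copy of an event frame with one random rectangle zeroed.

    frame: array of shape (H, W) or (C, H, W).
    max_frac: assumed upper bound on the mask size, as a fraction of H and W.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = frame.copy()  # leave the input untouched
    h, w = out.shape[-2:]

    # Sample the mask height and width (at least one pixel each).
    mask_h = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    mask_w = int(rng.integers(1, max(2, int(w * max_frac) + 1)))

    # Sample the top-left corner so the rectangle stays inside the frame.
    top = int(rng.integers(0, h - mask_h + 1))
    left = int(rng.integers(0, w - mask_w + 1))

    # The same spatial region is masked in every channel.
    out[..., top:top + mask_h, left:left + mask_w] = 0
    return out


if __name__ == "__main__":
    frame = np.ones((2, 60, 80), dtype=np.float32)
    masked = event_cutout(frame, rng=np.random.default_rng(0))
    print("masked pixels per channel:", int((masked[0] == 0).sum()))
```

Masking the same region in every channel keeps the occlusion spatially consistent across the bits or polarities of a multi-channel event representation; whether the paper does this, or uses several masks per frame, is not stated in the abstract.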
Cite
Text
Wang et al. "MambaPupil: Bidirectional Selective Recurrent Model for Event-Based Eye Tracking." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00585
Markdown
[Wang et al. "MambaPupil: Bidirectional Selective Recurrent Model for Event-Based Eye Tracking." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/wang2024cvprw-mambapupil/) doi:10.1109/CVPRW63382.2024.00585
BibTeX
@inproceedings{wang2024cvprw-mambapupil,
title = {{MambaPupil: Bidirectional Selective Recurrent Model for Event-Based Eye Tracking}},
author = {Wang, Zhong and Wan, Zengyu and Han, Han and Liao, Bohao and Wu, Yuliang and Zhai, Wei and Cao, Yang and Zha, Zheng-Jun},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2024},
pages = {5762-5770},
doi = {10.1109/CVPRW63382.2024.00585},
url = {https://mlanthology.org/cvprw/2024/wang2024cvprw-mambapupil/}
}