OmniGaze: Reward-Inspired Generalizable Gaze Estimation in the Wild

Abstract

Current 3D gaze estimation methods struggle to generalize across diverse data domains, primarily due to i) the scarcity of annotated datasets, and ii) the insufficient diversity of labeled data. In this work, we present OmniGaze, a semi-supervised framework for 3D gaze estimation that uses large-scale unlabeled data collected from diverse and unconstrained real-world environments to mitigate domain bias and generalize gaze estimation in the wild. First, we build a diverse collection of unlabeled facial images, varying in facial appearance, background environment, illumination conditions, head pose, and eye occlusion. To leverage unlabeled data spanning a broader distribution, OmniGaze adopts a standard pseudo-labeling strategy and devises a reward model to assess the reliability of pseudo labels. To compute confidence scores, the reward model takes as input not only the pseudo labels, represented as 3D direction vectors, but also visual embeddings extracted by an off-the-shelf visual encoder and gaze-related semantic cues generated by prompting a Multimodal Large Language Model. These scores are then used to select high-quality pseudo labels and to weight them in the loss computation. Extensive experiments demonstrate that OmniGaze achieves state-of-the-art performance on five datasets under both in-domain and cross-domain settings. We further evaluate the efficacy of OmniGaze as a scalable data engine for gaze estimation, where it exhibits robust zero-shot generalization on four unseen datasets.
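The selection-and-weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the loss, the selection rule, or the normalization, so everything below is an assumption made for illustration: an angular-error loss, a fixed `threshold` for selection, and a score-weighted average. The function names `angular_error` and `weighted_pseudo_label_loss` are hypothetical and do not come from the paper.

```python
import math


def angular_error(u, v):
    """Angle in radians between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos)


def weighted_pseudo_label_loss(preds, pseudo_labels, scores, threshold=0.5):
    """Confidence-weighted loss over pseudo-labeled samples.

    Assumptions (not taken from the paper): samples whose confidence
    score is below `threshold` are discarded, and the remaining angular
    errors are averaged with the scores as weights.
    """
    kept = [
        (p, y, s)
        for p, y, s in zip(preds, pseudo_labels, scores)
        if s >= threshold
    ]
    if not kept:
        # No pseudo label passed the selection step.
        return 0.0
    total_weight = sum(s for _, _, s in kept)
    weighted_sum = sum(s * angular_error(p, y) for p, y, s in kept)
    return weighted_sum / total_weight


# Toy example: three unlabeled samples with predicted gaze directions,
# pseudo labels, and confidence scores standing in for a reward model's output.
preds = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
pseudo_labels = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
scores = [0.9, 0.6, 0.1]

loss = weighted_pseudo_label_loss(preds, pseudo_labels, scores)
```

In this toy example the third sample is dropped because its score falls below the threshold, and the second sample contributes less to the loss than it would under uniform weighting because its score is lower than the first sample's.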

Cite

Text

Qu et al. "OmniGaze: Reward-Inspired Generalizable Gaze Estimation in the Wild." Advances in Neural Information Processing Systems, 2025.

Markdown

[Qu et al. "OmniGaze: Reward-Inspired Generalizable Gaze Estimation in the Wild." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/qu2025neurips-omnigaze/)

BibTeX

@inproceedings{qu2025neurips-omnigaze,
  title     = {{OmniGaze: Reward-Inspired Generalizable Gaze Estimation in the Wild}},
  author    = {Qu, Hongyu and Wei, Jianan and Shu, Xiangbo and Yao, Yazhou and Wang, Wenguan and Tang, Jinhui},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/qu2025neurips-omnigaze/}
}