Uncertainty-Aware Sign Language Video Retrieval with Probability Distribution Modeling

Abstract

Sign language video retrieval plays a key role in facilitating information access for the deaf community. Despite significant advances in video-text retrieval, the complexity and inherent uncertainty of sign language preclude direct applications of these techniques. Previous methods achieve mapping between sign language videos and text through fine-grained modal alignment. However, due to the scarcity of fine-grained annotations, the uncertainty inherent in sign language videos is underestimated, limiting further development of sign language retrieval tasks. To address this challenge, we propose a new Uncertainty-aware Probability Distribution Retrieval (UPRet), which conceptualizes the mapping process of sign language videos and texts in terms of probability distributions, explores their potential interrelationships, and enables flexible mappings. Experiments on three benchmarks demonstrate the effectiveness of our method, which achieves state-of-the-art results on How2Sign (59.1%), PHOENIX-2014T (72.0%), and CSL-Daily (78.4%). Our source code is available: https://github.com/xua222/UPRet.

Cite

Text

Wu et al. "Uncertainty-Aware Sign Language Video Retrieval with Probability Distribution Modeling." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72784-9_22

Markdown

[Wu et al. "Uncertainty-Aware Sign Language Video Retrieval with Probability Distribution Modeling." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/wu2024eccv-uncertaintyaware/) doi:10.1007/978-3-031-72784-9_22

BibTeX

@inproceedings{wu2024eccv-uncertaintyaware,
  title     = {{Uncertainty-Aware Sign Language Video Retrieval with Probability Distribution Modeling}},
  author    = {Wu, Xuan and Li, Hongxiang and Luo, Yuanjiang and Cheng, Xuxin and Zhuang, Xianwei and Cao, Meng and Fu, Keren},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72784-9_22},
  url       = {https://mlanthology.org/eccv/2024/wu2024eccv-uncertaintyaware/}
}