EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding

Abstract

With the surge in attention to Egocentric Hand-Object Interaction (Ego-HOI), large-scale datasets such as Ego4D and EPIC-KITCHENS have been proposed. However, most current research is built on resources derived from third-person video action recognition. This inherent domain gap between first- and third-person action videos, which has not been adequately addressed before, makes current Ego-HOI approaches suboptimal. This paper rethinks Ego-HOI recognition and proposes a new framework, EgoPCA, as an infrastructure to advance it through Probing, Curation and Adaption. We contribute comprehensive pre-train sets, balanced test sets and a new baseline, complete with a training-finetuning strategy. With our new framework, we not only achieve state-of-the-art performance on Ego-HOI benchmarks but also build several new and effective mechanisms and settings to advance further research. We believe our data and findings will pave a new way for Ego-HOI understanding. Code and data are available at https://mvig-rhos.com/ego_pca.

Cite

Text

Xu et al. "EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00486

Markdown

[Xu et al. "EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/xu2023iccv-egopca/) doi:10.1109/ICCV51070.2023.00486

BibTeX

@inproceedings{xu2023iccv-egopca,
  title     = {{EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding}},
  author    = {Xu, Yue and Li, Yong-Lu and Huang, Zhemin and Liu, Michael Xu and Lu, Cewu and Tai, Yu-Wing and Tang, Chi-Keung},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {5273--5284},
  doi       = {10.1109/ICCV51070.2023.00486},
  url       = {https://mlanthology.org/iccv/2023/xu2023iccv-egopca/}
}