Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery

Abstract

Accurate 3D reconstruction of hands and instruments is critical for vision-based analysis of ophthalmic microsurgery, yet progress has been hampered by the lack of realistic, large-scale datasets and reliable annotation tools. In this work, we introduce OphNet-3D, the first extensive RGB-D dynamic 3D reconstruction dataset for ophthalmic surgery, comprising 41 sequences from 40 surgeons and totaling 7.1 million frames, with fine-grained annotations of 12 surgical phases, 10 instrument categories, dense MANO hand meshes, and full 6-DoF instrument poses. To produce high-fidelity labels at scale, we design a multi-stage automatic annotation pipeline that integrates multi-view observations, a data-driven motion prior with cross-view geometric consistency and biomechanical constraints, and collision-aware constraints for instrument interactions. Building upon OphNet-3D, we establish two challenging benchmarks, bimanual hand pose estimation and hand–instrument interaction reconstruction, and propose two dedicated architectures: H-Net for dual-hand mesh recovery and OH-Net for joint reconstruction of two-hand–two-instrument interactions. These models leverage a novel spatial reasoning module with weak-perspective camera modeling and a collision-aware center-based representation. Both architectures outperform existing methods by substantial margins, achieving improvements of over 2 mm in Mean Per Joint Position Error (MPJPE) for hand reconstruction and up to 23% in ADD-S metrics for instrument reconstruction.
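The abstract reports results in MPJPE and ADD-S without spelling out how they are computed. The following is a minimal sketch of the commonly used definitions, not the authors' evaluation code: MPJPE is taken as the mean Euclidean distance between predicted and ground-truth joints, and ADD-S as the mean distance from each ground-truth-posed model point to its nearest predicted-pose model point. The function names, argument layout, and the use of a raw distance (rather than an accuracy under a threshold) are assumptions made for illustration.

```python
import math


def mpjpe(pred_joints, gt_joints):
    """Mean Euclidean distance between corresponding 3D joints.

    The result is in the same unit as the inputs (e.g. mm).
    """
    dists = [math.dist(p, g) for p, g in zip(pred_joints, gt_joints)]
    return sum(dists) / len(dists)


def transform(points, R, t):
    """Apply a rigid transform x -> R x + t to a list of 3D points."""
    return [
        tuple(sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))
        for p in points
    ]


def add_s(model_points, R_gt, t_gt, R_pred, t_pred):
    """Symmetric average distance (assumed standard ADD-S definition).

    For each model point under the ground-truth pose, take the distance to
    the closest model point under the predicted pose, then average.
    """
    gt_pts = transform(model_points, R_gt, t_gt)
    pred_pts = transform(model_points, R_pred, t_pred)
    nearest = [min(math.dist(g, p) for p in pred_pts) for g in gt_pts]
    return sum(nearest) / len(nearest)
```

Because ADD-S matches each point to its nearest neighbor rather than to its own index, a pose that differs from the ground truth only by a symmetry of the object incurs no error, which suits instruments that look identical under rotation about their axis.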

Cite

Text

Hu et al. "Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery." Advances in Neural Information Processing Systems, 2025.

Markdown

[Hu et al. "Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/hu2025neurips-dynamic/)

BibTeX

@inproceedings{hu2025neurips-dynamic,
  title     = {{Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery}},
  author    = {Hu, Ming and Yu, Zhengdi and Tang, Feilong and Chen, Kaiwen and Li, Yulong and Razzak, Imran and He, Junjun and Birdal, Tolga and Zhou, Kaijing and Ge, Zongyuan},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/hu2025neurips-dynamic/}
}