Dual-S3D: Hierarchical Dual-Path Selective SSM-CNN for High-Fidelity Implicit Reconstruction

Abstract

Single-view 3D reconstruction aims to recover the complete 3D geometry and appearance of objects from a single RGB image. The task remains challenging because a single image carries incomplete and ambiguous information. Existing methods struggle with the trade-off between local detail and global topology, and with interference from early RGB-depth fusion in signed distance function optimization. To address these challenges, we propose Dual-S3D, a novel framework for single-view 3D reconstruction. Our method employs a hierarchical dual-path feature extraction strategy: early stages use convolutional neural networks to anchor local geometric details, while subsequent stages leverage a Transformer integrated with a selective state-space model to capture global topology, enhancing scene understanding and feature representation. Additionally, we design an auxiliary branch that progressively fuses precomputed depth features with pixel-level features to decouple visual and geometric cues effectively. Extensive experiments on the 3D-FRONT and Pix3D datasets demonstrate that our approach significantly outperforms existing methods (reducing Chamfer distance by 51%, increasing F-score by 33.6%, and improving normal consistency by 10.3%), thus achieving state-of-the-art reconstruction quality.
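
For readers unfamiliar with two of the metrics named in the abstract, the following is a minimal sketch of how Chamfer distance and F-score are commonly computed between a predicted and a ground-truth point set. It is not the paper's evaluation code. The function names, the unsquared mean-of-nearest-neighbour form of the Chamfer distance, and the threshold `tau` are all assumptions made for illustration; conventions (squared vs. unsquared distances, sum vs. mean, threshold value) differ between benchmarks, and the abstract does not specify which one is used. Normal consistency is omitted because it additionally requires surface normals.

```python
import numpy as np

def nearest_neighbor_dist(a, b):
    """For each point in a (N, 3), Euclidean distance to its nearest point in b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]           # (N, M, 3) pairwise differences
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)

def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance (assumed form: mean unsquared distance, both directions)."""
    return nearest_neighbor_dist(pred, gt).mean() + nearest_neighbor_dist(gt, pred).mean()

def f_score(pred, gt, tau=0.01):
    """Harmonic mean of precision and recall at distance threshold tau (hypothetical default)."""
    precision = (nearest_neighbor_dist(pred, gt) < tau).mean()
    recall = (nearest_neighbor_dist(gt, pred) < tau).mean()
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))

if __name__ == "__main__":
    # Toy data: the prediction is the ground truth shifted by 0.5 along z.
    gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
    pred = gt + np.array([0.0, 0.0, 0.5])
    print(chamfer_distance(pred, gt))
    print(f_score(pred, gt, tau=0.1), f_score(pred, gt, tau=0.6))
```

The brute-force pairwise distance matrix keeps the sketch short but uses O(NM) memory; for dense point clouds a spatial index such as a k-d tree would be the usual substitute.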

Cite

Text

Zhang et al. "Dual-S3D: Hierarchical Dual-Path Selective SSM-CNN for High-Fidelity Implicit Reconstruction." International Conference on Computer Vision, 2025.

Markdown

[Zhang et al. "Dual-S3D: Hierarchical Dual-Path Selective SSM-CNN for High-Fidelity Implicit Reconstruction." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/zhang2025iccv-duals3d/)

BibTeX

@inproceedings{zhang2025iccv-duals3d,
  title     = {{Dual-S3D: Hierarchical Dual-Path Selective SSM-CNN for High-Fidelity Implicit Reconstruction}},
  author    = {Zhang, Luoxi and Shrestha, Pragyan and Zhou, Yu and Xie, Chun and Kitahara, Itaru},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {25104-25113},
  url       = {https://mlanthology.org/iccv/2025/zhang2025iccv-duals3d/}
}