UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation
Abstract
Modality or domain distribution shifts pose formidable challenges in 3D semantic segmentation. Existing methods predominantly address either cross-modal or cross-domain adaptation in isolation, leading to insufficient exploration of semantic associations and complementary features in heterogeneous data. To bridge this gap, we present UniDxMD, a unified representation method for cross-modal unsupervised domain adaptation (UDA) in 3D semantic segmentation that simultaneously tackles both cross-modal and cross-domain adaptation objectives. Our core insight is deriving a unified discrete representation from heterogeneous data to mitigate distribution shifts, inspired by vector quantization. Specifically, we propose a differentiable, cluster-based soft quantization mechanism (CSQM) that maps heterogeneous data (spanning modalities and domains) into a shared discrete latent space. Then, we introduce latent space regularization (LSR), leveraging joint prototypes that satisfy semantic relation consistency as learnable anchors to enhance the compactness and semantic discriminability of the discrete latent space. Our method paves the way for advancing cross-modal UDA in 3D semantic segmentation towards the unified representation. Extensive results across four challenging cross-modal UDA scenarios demonstrate the superiority of our method.
Cite
Text
Liang et al. "UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation." International Conference on Computer Vision, 2025.Markdown
[Liang et al. "UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/liang2025iccv-unidxmd/)BibTeX
@inproceedings{liang2025iccv-unidxmd,
title = {{UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation}},
author = {Liang, Zhengyin and Yin, Hui and Liang, Min and Du, Qianqian and Yang, Ying and Huang, Hua},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {20346-20356},
url = {https://mlanthology.org/iccv/2025/liang2025iccv-unidxmd/}
}