Revisiting Multimodal Fusion for 3D Anomaly Detection from an Architectural Perspective

Abstract

Existing efforts to improve multimodal fusion for 3D anomaly detection (3D-AD) primarily concentrate on devising more effective multimodal fusion strategies. However, little attention has been devoted to analyzing the role of multimodal fusion architecture (topology) design in 3D-AD. In this paper, we aim to bridge this gap and present a systematic study of the impact of multimodal fusion architecture design on 3D-AD. This work considers multimodal fusion architecture design at the intra-module fusion level, i.e., independent modality-specific modules involving early, middle, or late multimodal features with specific fusion operations, and at the inter-module fusion level, i.e., the strategies used to fuse those modules. In both cases, we first derive insights by theoretically and experimentally exploring how architectural designs influence 3D-AD. Then, we extend the state-of-the-art (SOTA) neural architecture search (NAS) paradigm and propose 3D-ADNAS, which, for the first time, simultaneously searches across multimodal fusion strategies and modality-specific modules. Extensive experiments show that 3D-ADNAS obtains consistent improvements in 3D-AD across various model capacities in terms of accuracy, frame rate, and memory usage, and that it shows strong potential for few-shot 3D-AD tasks.
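
To make the intra-module design space described above more concrete, the following is a minimal sketch of how candidate choices (a feature level paired with a fusion operation) could be enumerated and applied. It is an illustration only, not the 3D-ADNAS search space: the operation set (`sum`, `max`, `concat`), the toy feature vectors, and all function names are assumptions introduced here and do not come from the paper.

```python
from itertools import product

# Hypothetical fusion operations on two same-length feature vectors
# (RGB features and point-cloud features). Not taken from the paper.
def fuse_sum(rgb, pc):
    return [a + b for a, b in zip(rgb, pc)]

def fuse_max(rgb, pc):
    return [max(a, b) for a, b in zip(rgb, pc)]

def fuse_concat(rgb, pc):
    return list(rgb) + list(pc)

FUSION_OPS = {"sum": fuse_sum, "max": fuse_max, "concat": fuse_concat}
FEATURE_LEVELS = ("early", "middle", "late")

def enumerate_intra_module_choices():
    """Return every (feature level, fusion op name) pair in the toy space."""
    return list(product(FEATURE_LEVELS, FUSION_OPS))

# Toy per-level features standing in for real backbone outputs.
rgb_feats = {"early": [1.0, 2.0], "middle": [0.5, 0.1], "late": [3.0, 0.0]}
pc_feats = {"early": [0.0, 4.0], "middle": [0.2, 0.3], "late": [1.0, 1.0]}

def apply_choice(level, op_name):
    """Fuse the two modalities at the given level with the given operation."""
    return FUSION_OPS[op_name](rgb_feats[level], pc_feats[level])

choices = enumerate_intra_module_choices()
fused_early_sum = apply_choice("early", "sum")
fused_late_concat = apply_choice("late", "concat")
```

In a real search, each such choice would be scored by training or evaluating the resulting detector; here the enumeration only shows how quickly the space grows once feature levels and fusion operations are combined.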

Cite

Text

Long et al. "Revisiting Multimodal Fusion for 3D Anomaly Detection from an Architectural Perspective." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I12.33337

Markdown

[Long et al. "Revisiting Multimodal Fusion for 3D Anomaly Detection from an Architectural Perspective." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/long2025aaai-revisiting/) doi:10.1609/AAAI.V39I12.33337

BibTeX

@inproceedings{long2025aaai-revisiting,
  title     = {{Revisiting Multimodal Fusion for 3D Anomaly Detection from an Architectural Perspective}},
  author    = {Long, Kaifang and Xie, Guoyang and Ma, Lianbo and Liu, Jiaqi and Lu, Zhichao},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {12273--12281},
  doi       = {10.1609/AAAI.V39I12.33337},
  url       = {https://mlanthology.org/aaai/2025/long2025aaai-revisiting/}
}