CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation
Abstract
While accurate and user-friendly Computer-Aided Design (CAD) is crucial for industrial design and manufacturing, existing methods still struggle to achieve it, either because their representations are over-simplified or because their architectures cannot support multimodal design requirements. In this paper, we tackle this problem from both the method and the dataset perspectives. First, we propose a cascade MAR with topology predictor (CMT), the first multimodal framework for CAD generation based on Boundary Representation (B-Rep). Specifically, the cascade MAR effectively captures the "edge-counters-surface" priors that are essential in B-Reps, while the topology predictor directly estimates B-Rep topology from the compact tokens in MAR. Second, to facilitate large-scale training, we develop a large-scale multimodal CAD dataset, mmABC, which includes over 1.3 million B-Rep models with multimodal annotations: point clouds, text descriptions, and multi-view images. Extensive experiments show the superiority of CMT in both conditional and unconditional CAD generation tasks. For example, in unconditional generation on ABC, we improve Coverage and Valid ratio by +10.68% and +10.3%, respectively, compared to state-of-the-art methods. CMT also improves Chamfer by +4.01 on image-conditioned CAD generation on mmABC.
Cite
Text
Wu et al. "CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation." International Conference on Computer Vision, 2025.
Markdown
[Wu et al. "CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/wu2025iccv-cmt/)
BibTeX
@inproceedings{wu2025iccv-cmt,
title = {{CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation}},
author = {Wu, Jianyu and Wang, Yizhou and Yue, Xiangyu and Ma, Xinzhu and Guo, Jinyang and Zhou, Dongzhan and Ouyang, Wanli and Tang, Shixiang},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {7014-7024},
url = {https://mlanthology.org/iccv/2025/wu2025iccv-cmt/}
}