Neural Multi-Objective Combinatorial Optimization via Graph-Image Multimodal Fusion
Abstract
Existing neural multi-objective combinatorial optimization (MOCO) methods still exhibit an optimality gap, as they fail to fully exploit the intrinsic features of problem instances. A significant factor contributing to this shortfall is their reliance on graph-modal information alone. To overcome this, we propose a novel graph-image multimodal fusion (GIMF) framework that enhances neural MOCO methods by integrating graph and image information of the problem instances. Our GIMF framework comprises three key components: (1) a coordinate image constructed to better represent the spatial structure of the problem instance, (2) a problem-size adaptive resolution strategy during image construction to improve the cross-size generalization of the model, and (3) a multimodal fusion mechanism with modality-specific bottlenecks to efficiently couple graph and image information. We demonstrate the versatility of GIMF by implementing it with two state-of-the-art neural MOCO backbones. Experimental results on classic MOCO problems show that GIMF significantly outperforms state-of-the-art neural MOCO methods and exhibits superior generalization capability.
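To make the first two components concrete, below is a minimal sketch of how a coordinate image with problem-size adaptive resolution might be built. The function name `coords_to_image`, the parameters `base_res` and `ref_size`, and the square-root scaling rule are illustrative assumptions for this sketch, not the paper's actual construction.

```python
import numpy as np

def coords_to_image(coords, base_res=64, ref_size=100):
    """Rasterize 2-D node coordinates (assumed in [0, 1]^2) into a
    binary occupancy image. The resolution grows with instance size
    so that node density per pixel stays roughly constant across
    problem sizes (illustrative rule, not the paper's formula)."""
    n = len(coords)
    # Problem-size adaptive resolution: scale with sqrt(n / ref_size).
    res = max(8, int(round(base_res * np.sqrt(n / ref_size))))
    img = np.zeros((res, res), dtype=np.float32)
    # Map each coordinate to a pixel index; clip handles coords == 1.0.
    idx = np.clip((coords * res).astype(int), 0, res - 1)
    img[idx[:, 1], idx[:, 0]] = 1.0  # mark occupied pixels
    return img

coords = np.random.rand(100, 2)  # a hypothetical 100-node instance
img = coords_to_image(coords)    # 64x64 image at the reference size
```

Under this rule, a 400-node instance would be rasterized at 128x128, keeping the average number of nodes per pixel comparable to the 100-node case.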
Cite
Chen et al. "Neural Multi-Objective Combinatorial Optimization via Graph-Image Multimodal Fusion." International Conference on Learning Representations, 2025.

BibTeX
@inproceedings{chen2025iclr-neural,
title = {{Neural Multi-Objective Combinatorial Optimization via Graph-Image Multimodal Fusion}},
author = {Chen, Jinbiao and Wang, Jiahai and Cao, Zhiguang and Wu, Yaoxin},
booktitle = {International Conference on Learning Representations},
year = {2025},
url = {https://mlanthology.org/iclr/2025/chen2025iclr-neural/}
}