Deep Interactions for Multimodal Molecular Property Prediction
Abstract
Multimodal learning that leverages both 2D graph and 3D point cloud information has become a prevalent approach to improving model performance in molecular property prediction. However, many recent techniques focus on specific pre-training tasks, such as contrastive learning, feature blending, and atom/subgraph masking, to learn multimodal representations, even though model architecture design also strongly affects both pre-training and downstream task performance. Relying on pre-training tasks alone to align the 2D and 3D modalities lacks the direct cross-modal interaction that may be more effective for multimodal learning. In this work, we propose MolInteract, a simple yet effective architecture-focused approach to multimodal molecule learning that addresses these challenges. MolInteract uses an interaction layer to fuse 2D and 3D information and foster cross-modal alignment, achieving strong results even with the simplest pre-training objectives, such as predicting features of the 3D point cloud and 2D graph. MolInteract outperforms several state-of-the-art multimodal pre-training techniques and architectures on a variety of downstream 2D and 3D molecular property prediction benchmarks.
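As an illustration only (not the authors' implementation, which the abstract does not specify), an interaction layer that fuses 2D graph and 3D point cloud features can be sketched as bidirectional cross-attention between per-atom features of the two modalities. All names, shapes, and the residual-update design below are assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, keys_values):
    # Scaled dot-product attention: one modality's atom features (queries)
    # attend to the other modality's atom features (keys/values).
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ keys_values

def interaction_layer(h2d, h3d):
    """Hypothetical fusion step: each modality's per-atom features are
    updated (via a residual connection) with information gathered from
    the other modality."""
    h2d_new = h2d + cross_attend(h2d, h3d)  # 2D attends to 3D
    h3d_new = h3d + cross_attend(h3d, h2d)  # 3D attends to 2D
    return h2d_new, h3d_new

# Toy example: 5 atoms, 8-dimensional features per modality.
rng = np.random.default_rng(0)
h2d = rng.normal(size=(5, 8))
h3d = rng.normal(size=(5, 8))
f2d, f3d = interaction_layer(h2d, h3d)
print(f2d.shape, f3d.shape)  # (5, 8) (5, 8)
```

In practice such a layer would also include learned projection matrices and would be interleaved with the 2D GNN and 3D point cloud encoder blocks; the sketch above only shows the cross-modal exchange itself.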
Cite
Text
Soga et al. "Deep Interactions for Multimodal Molecular Property Prediction." NeurIPS 2024 Workshops: AIDrugX, 2024.
Markdown
[Soga et al. "Deep Interactions for Multimodal Molecular Property Prediction." NeurIPS 2024 Workshops: AIDrugX, 2024.](https://mlanthology.org/neuripsw/2024/soga2024neuripsw-deep/)
BibTeX
@inproceedings{soga2024neuripsw-deep,
  title     = {{Deep Interactions for Multimodal Molecular Property Prediction}},
  author    = {Soga, Patrick and Lei, Zhenyu and Bilodeau, Camille L. and Li, Jundong},
  booktitle = {NeurIPS 2024 Workshops: AIDrugX},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/soga2024neuripsw-deep/}
}