CFD: Learning Generalized Molecular Representation via Concept-Enhanced Feedback Disentanglement
Abstract
Molecular representation learning (MRL) has attracted much attention as a way to accelerate biochemical research, e.g., drug and protein discovery. However, most existing methods follow the closed-set assumption that training and testing data share an identical distribution, which limits their ability to generalize in out-of-distribution (OOD) cases. In this paper, we explore a new disentanglement mechanism for learning generalized molecular representations that are robust to distribution shifts. We propose Concept-Enhanced Feedback Disentanglement (CFD), which exploits a feedback mechanism to learn distribution-agnostic representations. Specifically, we first propose two dedicated variational encoders that separately extract distribution-agnostic and spurious features. Then, a set of molecule-aware concepts is used to focus on invariant substructure characteristics. Fusing these concepts into the disentangled distribution-agnostic features further enhances the generalization ability of the learned molecular representation. Next, we iteratively repeat the disentanglement operations, each time using feedback from the previous output. Finally, based on the outputs of multiple feedback iterations, we construct a self-supervised objective that encourages the variational encoders to acquire the disentangling capability. In the experiments, our method is verified on multiple real-world molecular datasets. The significant performance gains over state-of-the-art baselines demonstrate that our method can effectively disentangle generalized molecular representations in the presence of various distribution shifts. The source code will be released at https://github.com/AmingWu/MoleculeCFD.
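The pipeline described in the abstract (two variational encoders, concept fusion, feedback iterations, and a self-supervised objective over the iterations) can be illustrated with a toy sketch. This is not the authors' implementation: every name below (`VariationalEncoder`, `fuse_concepts`, `feedback_disentangle`, `consistency_loss`) is hypothetical, a plain feature vector stands in for an encoded molecule, the fusion is assumed to be softmax attention with a residual sum, the feedback is assumed to be added to the input, and the objective is assumed to be a simple consistency term between consecutive iterations. The abstract does not specify any of these choices.

```python
import math
import random


def init_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Dense layer without bias: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class VariationalEncoder:
    """Toy Gaussian encoder: predicts a mean and log-variance, then samples."""

    def __init__(self, dim, rng):
        self.w_mu = init_matrix(dim, dim, rng)
        self.w_logvar = init_matrix(dim, dim, rng)
        self.rng = rng

    def __call__(self, x):
        mu = linear(self.w_mu, x)
        logvar = linear(self.w_logvar, x)
        # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1).
        return [m + math.exp(0.5 * lv) * self.rng.gauss(0.0, 1.0)
                for m, lv in zip(mu, logvar)]


def fuse_concepts(z, concepts):
    """Assumed fusion: softmax attention over concept vectors, added to z."""
    scores = [sum(a * b for a, b in zip(z, c)) for c in concepts]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    mixed = [sum(e / total * c[i] for e, c in zip(exps, concepts))
             for i in range(len(z))]
    return [a + b for a, b in zip(z, mixed)]


def feedback_disentangle(x, enc_inv, enc_spu, concepts, steps=3):
    """Repeat the disentangling step, feeding the previous output back in."""
    feedback = [0.0] * len(x)  # no feedback on the first pass
    outputs = []
    for _ in range(steps):
        # Assumption: feedback is simply added to the input features,
        # so the latent size must equal the input size.
        inp = [a + b for a, b in zip(x, feedback)]
        z_inv = fuse_concepts(enc_inv(inp), concepts)  # distribution-agnostic
        z_spu = enc_spu(inp)                           # spurious
        outputs.append((z_inv, z_spu))
        feedback = z_inv
    return outputs


def consistency_loss(outputs):
    """Assumed objective: mean squared change of the distribution-agnostic
    features between consecutive feedback iterations."""
    pairs = list(zip(outputs, outputs[1:]))
    if not pairs:
        return 0.0
    total = 0.0
    for (prev_inv, _), (next_inv, _) in pairs:
        total += sum((a - b) ** 2
                     for a, b in zip(prev_inv, next_inv)) / len(prev_inv)
    return total / len(pairs)


rng = random.Random(0)
dim = 4
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # stand-in molecule features
concepts = init_matrix(3, dim, rng)            # three hypothetical concepts
outputs = feedback_disentangle(
    x, VariationalEncoder(dim, rng), VariationalEncoder(dim, rng), concepts)
loss = consistency_loss(outputs)
```

In a real system the input would come from a molecular graph encoder and the weights would be trained by gradient descent on the objective; the sketch only shows how the pieces named in the abstract could connect.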
Cite
Text
Wu and Deng. "CFD: Learning Generalized Molecular Representation via Concept-Enhanced Feedback Disentanglement." International Conference on Learning Representations, 2025.
Markdown
[Wu and Deng. "CFD: Learning Generalized Molecular Representation via Concept-Enhanced Feedback Disentanglement." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/wu2025iclr-cfd/)
BibTeX
@inproceedings{wu2025iclr-cfd,
title = {{CFD: Learning Generalized Molecular Representation via Concept-Enhanced Feedback Disentanglement}},
author = {Wu, Aming and Deng, Cheng},
booktitle = {International Conference on Learning Representations},
year = {2025},
url = {https://mlanthology.org/iclr/2025/wu2025iclr-cfd/}
}