Agnostic Causality-Driven Enhancement of Chemical Foundation Models on Downstream Tasks
Abstract
Recent advancements in large foundation models have revealed impressive capabilities in mastering complex chemical language representations. These models undergo a task-agnostic learning phase, characterized by pre-training on extensive unlabeled corpora followed by fine-tuning on specific downstream tasks. This methodology reduces reliance on labeled data, facilitating data acquisition and broadening the scope of chemical language representation. However, real-world scenarios often pose challenges due to domain shift, necessitating robust domain adaptation strategies to maintain performance across different contexts. To address this, we present a novel causality-based framework for feature selection and domain adaptation that enhances the performance of chemical foundation models on downstream tasks. Our approach employs a multi-stage feature selection method that identifies physico-chemical features based on their direct causal effect on specific downstream properties. By employing Mordred descriptors and Markov blanket causal graphs, our approach provides insight into the causal relationships between features and target properties for prediction tasks. We evaluate our approach on various foundation model architectures and datasets, demonstrating consistent performance improvements that showcase its robustness and agnostic nature.
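The abstract does not specify which Markov blanket algorithm is used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a grow-shrink style search with a linear-Gaussian conditional independence test (partial correlation with a Fisher z statistic). The column names, the synthetic data standing in for Mordred descriptors, and the threshold `z_crit` are all hypothetical choices made for illustration. The synthetic target has only direct parents; a full Markov blanket would also contain children and their other parents.

```python
import math
import random


def corr(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(data, i, j, cond):
    """Partial correlation of columns i and j given the columns in cond,
    computed with the standard recursive formula."""
    if not cond:
        return corr(data[i], data[j])
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(data, i, j, rest)
    r_ik = partial_corr(data, i, k, rest)
    r_jk = partial_corr(data, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def dependent(data, i, j, cond, z_crit):
    """Fisher z test: True if i and j look dependent given cond."""
    n = len(data[i])
    r = partial_corr(data, i, j, cond)
    r = max(min(r, 0.999999), -0.999999)  # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    return abs(z) > z_crit


def markov_blanket(data, target, z_crit=4.0):
    """Grow-shrink estimate of the Markov blanket of `target`.
    z_crit is a deliberately conservative, hypothetical threshold."""
    candidates = [c for c in data if c != target]
    mb = []
    changed = True
    while changed:  # grow: add anything still dependent given the current set
        changed = False
        for c in candidates:
            if c not in mb and dependent(data, target, c, tuple(mb), z_crit):
                mb.append(c)
                changed = True
    for c in list(mb):  # shrink: drop features explained by the others
        rest = tuple(m for m in mb if m != c)
        if not dependent(data, target, c, rest, z_crit):
            mb.remove(c)
    return mb


# Synthetic stand-ins for descriptor columns (hypothetical names).
random.seed(0)
n = 2000
a = [random.gauss(0, 1) for _ in range(n)]
b = [random.gauss(0, 1) for _ in range(n)]
proxy = [x + 0.5 * random.gauss(0, 1) for x in a]  # correlated with a, no direct effect
noise = [random.gauss(0, 1) for _ in range(n)]
target = [2 * x - y + 0.5 * random.gauss(0, 1) for x, y in zip(a, b)]

data = {"d_proxy": proxy, "d_parent_a": a, "d_parent_b": b,
        "d_noise": noise, "target": target}
selected = markov_blanket(data, "target")
print(selected)
```

In this toy setup, `d_proxy` is correlated with the target only through `d_parent_a`, so it is expected to enter during the grow phase and be removed during the shrink phase, leaving the two features with a direct effect. Real descriptor tables are often nonlinear and highly collinear, so a different independence test may be more appropriate there.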
Cite
Text
Shirasuna et al. "Agnostic Causality-Driven Enhancement of Chemical Foundation Models on Downstream Tasks." NeurIPS 2024 Workshops: FM4Science, 2024.
Markdown
[Shirasuna et al. "Agnostic Causality-Driven Enhancement of Chemical Foundation Models on Downstream Tasks." NeurIPS 2024 Workshops: FM4Science, 2024.](https://mlanthology.org/neuripsw/2024/shirasuna2024neuripsw-agnostic/)
BibTeX
@inproceedings{shirasuna2024neuripsw-agnostic,
title = {{Agnostic Causality-Driven Enhancement of Chemical Foundation Models on Downstream Tasks}},
author = {Shirasuna, Victor Yukio and Soares, Eduardo and Brazil, Emilio Vital and Gutierrez, Karen Fiorella Aquino and Cerqueira, Renato and Zubarev, Dmitry and Schmidt, Kristin},
booktitle = {NeurIPS 2024 Workshops: FM4Science},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/shirasuna2024neuripsw-agnostic/}
}