Molphenix: A Multimodal Foundation Model for PhenoMolecular Retrieval

Abstract

Predicting molecular impact on cellular function is a core challenge in therapeutic design. Phenomic experiments, designed to capture cellular morphology, use microscopy-based techniques and offer a high-throughput solution for uncovering molecular impact on the cell. In this work, we learn a joint latent space between molecular structures and microscopy phenomic experiments, aligning paired samples with contrastive learning. Specifically, we study the problem of Contrastive PhenoMolecular Retrieval, which consists of zero-shot molecular structure identification conditioned on phenomic experiments. We assess challenges in multi-modal learning of phenomics and molecular modalities, such as experimental batch effects, inactive molecule perturbations, and encoding perturbation concentration. We demonstrate improved retrieval for multi-modal learners through (1) a uni-modal pre-trained phenomics model, (2) a novel inter-sample similarity-aware loss, and (3) models conditioned on a representation of molecular concentration. Following this recipe, we propose MolPhenix, a molecular phenomics model. MolPhenix leverages a pre-trained phenomics model to demonstrate significant performance gains across perturbation concentrations, molecular scaffolds, and activity thresholds. In particular, we demonstrate an 8.1$\times$ improvement in zero-shot molecular retrieval of active molecules over the previous state-of-the-art, reaching 77.33% top-1% accuracy. These results open the door for machine learning to be applied in virtual phenomics screening, which can significantly benefit drug discovery applications.
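As a rough illustration of the retrieval setup described in the abstract, the sketch below aligns paired phenomic and molecular embeddings with a contrastive objective and scores retrieval by top-k% accuracy. It is not the authors' implementation: the loss shown is a plain symmetric InfoNCE used as a stand-in for the paper's inter-sample similarity-aware loss, and the function names, the temperature value, and the assumption that row i of each matrix forms a matched pair are all hypothetical choices made for this example.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def info_nce_loss(phen: np.ndarray, mol: np.ndarray, temperature: float = 0.1) -> float:
    """Symmetric InfoNCE over a batch (stand-in loss, not the paper's).

    Assumes row i of `phen` and row i of `mol` are a matched pair.
    """
    logits = l2_normalize(phen) @ l2_normalize(mol).T / temperature

    def cross_entropy_on_diagonal(z: np.ndarray) -> float:
        z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return float(-np.mean(np.diag(log_probs)))

    # Average the phenomics-to-molecule and molecule-to-phenomics directions.
    return 0.5 * (cross_entropy_on_diagonal(logits) + cross_entropy_on_diagonal(logits.T))


def top_percent_accuracy(phen: np.ndarray, mol: np.ndarray, percent: float = 1.0) -> float:
    """Fraction of phenomic queries whose paired molecule ranks in the top `percent`% of candidates."""
    sims = l2_normalize(phen) @ l2_normalize(mol).T
    k = max(1, int(np.ceil(sims.shape[1] * percent / 100.0)))
    true_sim = np.diag(sims)[:, None]
    # Rank of the true molecule = number of candidates scoring strictly higher.
    ranks = (sims > true_sim).sum(axis=1)
    return float(np.mean(ranks < k))
```

In this sketch, the embeddings would come from a phenomics encoder and a molecular encoder projected into the shared latent space; at evaluation time each phenomic experiment acts as a query and every candidate molecule is ranked by cosine similarity.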

Cite

Text

Fradkin et al. "Molphenix: A Multimodal Foundation Model for PhenoMolecular Retrieval." NeurIPS 2024 Workshops: AIDrugX, 2024.

Markdown

[Fradkin et al. "Molphenix: A Multimodal Foundation Model for PhenoMolecular Retrieval." NeurIPS 2024 Workshops: AIDrugX, 2024.](https://mlanthology.org/neuripsw/2024/fradkin2024neuripsw-molphenix/)

BibTeX

@inproceedings{fradkin2024neuripsw-molphenix,
  title     = {{Molphenix: A Multimodal Foundation Model for PhenoMolecular Retrieval}},
  author    = {Fradkin, Philip and Moghadam, Puria Azadi and Suri, Karush and Wenkel, Frederik and Sypetkowski, Maciej and Beaini, Dominique},
  booktitle = {NeurIPS 2024 Workshops: AIDrugX},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/fradkin2024neuripsw-molphenix/}
}