DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Screening

Abstract

Virtual screening, which identifies potential drugs from vast compound databases to bind with a particular protein pocket, is a critical step in AI-assisted drug discovery. Traditional docking methods are highly time-consuming and, in real-life applications, can only work with a restricted search library. Recent supervised learning approaches that use scoring functions for binding-affinity prediction, although promising, have not yet surpassed docking methods because they depend strongly on limited data with reliable binding-affinity labels. In this paper, we propose a novel contrastive learning framework, DrugCLIP, which reformulates virtual screening as a dense retrieval task and employs contrastive learning to align representations of binding protein pockets and molecules from a large quantity of pairwise data without explicit binding-affinity scores. We also introduce a biological-knowledge-inspired data augmentation strategy to learn better protein-molecule representations. Extensive experiments show that DrugCLIP significantly outperforms traditional docking and supervised learning methods on diverse virtual screening benchmarks with greatly reduced computation time, especially in the zero-shot setting.
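
The idea of treating virtual screening as dense retrieval with contrastively aligned embeddings can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes a CLIP-style symmetric InfoNCE objective over a batch of matched pocket-molecule pairs and cosine-similarity ranking at screening time. The function names, the temperature of 0.07, and the random stand-in embeddings are all hypothetical, and the pocket and molecule encoders are omitted entirely.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale vectors to unit length so dot products equal cosine similarity.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_contrastive_loss(pocket_emb, mol_emb, temperature=0.07):
    # Row i of pocket_emb and row i of mol_emb are assumed to be a binding pair;
    # every other molecule in the batch serves as a negative for pocket i, and
    # vice versa. No binding-affinity labels are needed. (Assumed objective.)
    p = l2_normalize(pocket_emb)
    m = l2_normalize(mol_emb)
    logits = p @ m.T / temperature
    idx = np.arange(len(p))
    loss_pocket_to_mol = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_mol_to_pocket = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_pocket_to_mol + loss_mol_to_pocket)

def rank_library(pocket_emb, library_emb, top_k=5):
    # Screening as retrieval: score each library molecule by cosine similarity
    # to a single pocket embedding and return the top_k indices and scores.
    scores = l2_normalize(library_emb) @ l2_normalize(pocket_emb)
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]

# Random vectors stand in for encoder outputs (hypothetical data).
rng = np.random.default_rng(0)
pockets = rng.normal(size=(8, 16))
molecules_aligned = pockets + 0.01 * rng.normal(size=(8, 16))
molecules_random = rng.normal(size=(8, 16))

loss_aligned = symmetric_contrastive_loss(pockets, molecules_aligned)
loss_random = symmetric_contrastive_loss(pockets, molecules_random)
top_idx, top_scores = rank_library(pockets[3], molecules_aligned, top_k=3)
```

Under these assumptions, the molecule embeddings for a library can be computed once and reused for every query pocket, so screening reduces to a matrix-vector product followed by a sort. This is one plausible reading of where the reduced computation time relative to docking comes from.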

Cite

Text

Gao et al. "DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Screening." Neural Information Processing Systems, 2023.

Markdown

[Gao et al. "DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Screening." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/gao2023neurips-drugclip/)

BibTeX

@inproceedings{gao2023neurips-drugclip,
  title     = {{DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Screening}},
  author    = {Gao, Bowen and Qiang, Bo and Tan, Haichuan and Jia, Yinjun and Ren, Minsi and Lu, Minsi and Liu, Jingjing and Ma, Wei-Ying and Lan, Yanyan},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/gao2023neurips-drugclip/}
}