MolSiam: Simple Siamese Self-Supervised Representation Learning for Small Molecules

Abstract

We investigate a self-supervised learning technique from the Simple Siamese (SimSiam) representation learning framework on 2D molecular graphs. SimSiam does not require negative samples during training, making it 1) more computationally efficient and 2) less vulnerable to false negatives than contrastive learning. Leveraging unlabeled molecular data, we demonstrate that our approach, MolSiam, effectively captures underlying molecular features and show that molecules with similar properties tend to cluster together in UMAP projections. By fine-tuning pre-trained MolSiam models, we observe performance improvements across four downstream therapeutic property prediction tasks, all without training on negative pairs.
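To make the negative-free objective concrete, the following is a minimal NumPy sketch of the symmetrized negative-cosine loss that the SimSiam framework optimizes (the paper's own implementation details are not given here; function names and shapes are illustrative). In SimSiam, each of two augmented views is passed through an encoder to produce `z` and through an additional predictor head to produce `p`; the loss compares `p` from one view against `z` from the other, with a stop-gradient on `z`. Since NumPy has no autograd, the stop-gradient is indicated by a comment rather than an operation.

```python
import numpy as np

def neg_cosine(p, z):
    """Negative cosine similarity D(p, z), averaged over the batch.

    In the SimSiam setup, z carries a stop-gradient: it is treated as a
    constant target, so gradients flow only through the predictor output p.
    """
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # z = stopgrad(z)
    return -np.mean(np.sum(p * z, axis=1))

def simsiam_loss(p1, z1, p2, z2):
    """Symmetrized loss: L = D(p1, z2)/2 + D(p2, z1)/2.

    (p1, z1) come from one augmented view of the molecule graph,
    (p2, z2) from the other; no negative pairs are needed.
    """
    return 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)

# Toy usage: when predictor outputs align perfectly with the targets,
# the loss reaches its minimum of -1.
batch = np.random.default_rng(0).normal(size=(8, 16))
print(simsiam_loss(batch, batch, batch, batch))  # -> -1.0
```

Because only positive pairs (two views of the same molecule) enter the loss, there is no large negative batch to hold in memory and no risk of treating a chemically similar molecule as a "negative," which is the efficiency and robustness argument the abstract makes.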

Cite

Text

Lin. "MolSiam: Simple Siamese Self-Supervised Representation Learning for Small Molecules." NeurIPS 2023 Workshops: AI4D3, 2023.

Markdown

[Lin. "MolSiam: Simple Siamese Self-Supervised Representation Learning for Small Molecules." NeurIPS 2023 Workshops: AI4D3, 2023.](https://mlanthology.org/neuripsw/2023/lin2023neuripsw-molsiam/)

BibTeX

@inproceedings{lin2023neuripsw-molsiam,
  title     = {{MolSiam: Simple Siamese Self-Supervised Representation Learning for Small Molecules}},
  author    = {Lin, Joshua Yao-Yu},
  booktitle = {NeurIPS 2023 Workshops: AI4D3},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/lin2023neuripsw-molsiam/}
}