Evaluating Self-Supervised Learned Molecular Graphs
Abstract
Because of data scarcity in real-world scenarios, obtaining pre-trained representations via self-supervised learning (SSL) has attracted increasing interest. Although various methods have been proposed, it remains under-explored what knowledge the networks learn from the pre-training tasks and how that knowledge relates to downstream properties. In this work, with an emphasis on chemical molecular graphs, we fill this gap by devising a range of node-level, pair-level, and graph-level probe tasks to analyse the representations from pre-trained graph neural networks (GNNs). We empirically show that: 1. Pre-trained models achieve better downstream performance than randomly initialised models because of their improved capability to capture global topology and recognise substructures. 2. However, randomly initialised models outperform pre-trained models in retaining local topology; for pre-trained models, such information gradually disappears from the early layers to the last layers.
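A rough sketch of the node-level probing setup described above (an illustrative assumption, not the authors' exact protocol): freeze a pre-trained encoder, extract per-node embeddings, and fit a simple linear probe to predict a local-topology label such as node degree. The probe_node_property helper and the choice of node degree as the probed property are hypothetical placeholders.

# Minimal linear-probe sketch (hypothetical setup, not the paper's exact protocol):
# freeze a pre-trained GNN, take its node embeddings, and train a logistic-regression
# probe to predict a local-topology label (here: node degree).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def probe_node_property(embeddings: np.ndarray, labels: np.ndarray) -> float:
    """Fit a linear probe on frozen embeddings and return held-out accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        embeddings, labels, test_size=0.2, random_state=0, stratify=labels
    )
    probe = LogisticRegression(max_iter=1000)
    probe.fit(X_tr, y_tr)
    return accuracy_score(y_te, probe.predict(X_te))

# Usage (embeddings from any frozen encoder; degrees as the probed local property):
# acc = probe_node_property(node_embeddings, node_degrees)

Higher probe accuracy is read as the representation retaining more of the probed information; comparing pre-trained against randomly initialised encoders, layer by layer, gives the kind of analysis summarised in the abstract.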
Cite
Text
Wang et al. "Evaluating Self-Supervised Learned Molecular Graphs." ICML 2022 Workshops: Pre-Training, 2022.
Markdown
[Wang et al. "Evaluating Self-Supervised Learned Molecular Graphs." ICML 2022 Workshops: Pre-Training, 2022.](https://mlanthology.org/icmlw/2022/wang2022icmlw-evaluating-a/)
BibTeX
@inproceedings{wang2022icmlw-evaluating-a,
  title = {{Evaluating Self-Supervised Learned Molecular Graphs}},
  author = {Wang, Hanchen and Liu, Shengchao and Kaddour, Jean and Liu, Qi and Tang, Jian and Kusner, Matt and Lasenby, Joan},
  booktitle = {ICML 2022 Workshops: Pre-Training},
  year = {2022},
  url = {https://mlanthology.org/icmlw/2022/wang2022icmlw-evaluating-a/}
}