Benchmarking Graph Neural Network-Based Imputation Methods on Single-Cell Transcriptomics Data

Abstract

Single-cell RNA sequencing (scRNA-seq) provides vast amounts of gene expression data. In this paper, we benchmark several graph neural network (GNN) approaches for cell-type classification using imputed single-cell gene expression data. We model the Paul15 dataset, which describes the development of myeloid progenitors, as a bipartite graph of cell and gene nodes whose edge values represent gene expression. We train a 3-layer GraphSAGE GNN to impute data by having it reconstruct the dataset, guided by a downstream cell classification task. For this task, we apply a small graph convolutional network (GCN) to a cell-cell graph whose adjacency matrix is predetermined by spectral clustering. When combined with the data imputation model, GNN classification performance is 58%, marginally worse than an SVM benchmark of 59.4%; however, it exhibits better learning and generalisation characteristics and also produces an auxiliary imputation model. Our findings catalyse the development of new tools to analyse complex single-cell datasets.
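
The bipartite representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `expression_to_bipartite`, the toy matrix, the node-numbering scheme (cells first, then genes), and the choice to create edges only for non-zero expression values are all assumptions made for illustration.

```python
import numpy as np


def expression_to_bipartite(expr):
    """Turn a cells x genes expression matrix into a bipartite edge list.

    Hypothetical helper: cell nodes are numbered 0..n_cells-1 and gene
    nodes n_cells..n_cells+n_genes-1. Only non-zero entries become edges
    (an assumption; the abstract does not say how zeros are handled).
    """
    expr = np.asarray(expr, dtype=float)
    n_cells, _ = expr.shape
    # Row and column indices of every observed (non-zero) expression value.
    cell_idx, gene_idx = np.nonzero(expr)
    # Offset gene indices so the two node sets do not overlap.
    edge_index = np.stack([cell_idx, gene_idx + n_cells])
    # The edge value is the expression level of that gene in that cell.
    edge_weight = expr[cell_idx, gene_idx]
    return edge_index, edge_weight


# Toy data: 2 cells x 3 genes.
toy = np.array([[0.0, 2.0, 1.0],
                [3.0, 0.0, 0.0]])
edge_index, edge_weight = expression_to_bipartite(toy)
```

An edge list in this form (a 2 x E index array plus per-edge values) is the layout commonly consumed by GNN libraries, which is why it is used here; an imputation model would then be trained to predict the values of held-out or missing edges.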

Cite

Text

Li et al. "Benchmarking Graph Neural Network-Based Imputation Methods on Single-Cell Transcriptomics Data." NeurIPS 2022 Workshops: LMRL, 2022.

Markdown

[Li et al. "Benchmarking Graph Neural Network-Based Imputation Methods on Single-Cell Transcriptomics Data." NeurIPS 2022 Workshops: LMRL, 2022.](https://mlanthology.org/neuripsw/2022/li2022neuripsw-benchmarking/)

BibTeX

@inproceedings{li2022neuripsw-benchmarking,
  title     = {{Benchmarking Graph Neural Network-Based Imputation Methods on Single-Cell Transcriptomics Data}},
  author    = {Li, Han-Bo and Torné, Ramon Viñas and Lio, Pietro},
  booktitle = {NeurIPS 2022 Workshops: LMRL},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/li2022neuripsw-benchmarking/}
}