Benchmarking Graph Neural Network-Based Imputation Methods on Single-Cell Transcriptomics Data
Abstract
Single-cell RNA sequencing (scRNA-seq) provides vast amounts of gene expression data. In this paper, we benchmark several graph neural network (GNN) approaches for cell-type classification using imputed single-cell gene expression data. We model the Paul15 dataset, which describes the development of myeloid progenitors, as a bipartite graph of cell and gene nodes, with edge values signifying gene expression. We train a 3-layer GraphSAGE GNN to impute data by having it reconstruct the dataset based on a downstream cell classification task. For this task, we use a cell-cell graph representation with a small graph convolutional network (GCN) and an adjacency matrix predetermined by spectral clustering. When combined with the data imputation model, the GNN reaches a classification performance of 58%, marginally worse than an SVM benchmark of 59.4%; however, it exhibits better learning and generalisation characteristics and also produces an auxiliary imputation model. Our findings catalyse the development of new tools to analyse complex single-cell datasets.
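As a rough illustration of the bipartite representation described in the abstract, the sketch below turns a small cells-by-genes expression matrix into an edge list with one node per cell, one node per gene, and the expression value as the edge weight. The function name `expression_to_bipartite`, the toy matrix, and the choice to omit edges for zero entries are assumptions made for this example only; the abstract does not state how zero counts are handled or how the graph is stored.

```python
def expression_to_bipartite(expr):
    """Build a bipartite cell-gene edge list from a cells x genes matrix.

    Cell nodes get ids 0..n_cells-1 and gene nodes get ids
    n_cells..n_cells+n_genes-1, so the two node sets never collide.
    Zero entries produce no edge (an assumption of this sketch).
    """
    n_cells = len(expr)
    sources, targets, weights = [], [], []
    for cell, row in enumerate(expr):
        for gene, value in enumerate(row):
            if value != 0:
                sources.append(cell)            # cell node id
                targets.append(n_cells + gene)  # offset gene node id
                weights.append(float(value))    # expression as edge value
    return (sources, targets), weights


# Hypothetical toy data: 2 cells (rows) by 3 genes (columns).
toy_expression = [
    [0, 2, 1],
    [3, 0, 0],
]
edge_index, edge_weight = expression_to_bipartite(toy_expression)
```

In this form, imputation can be read as predicting weights for cell-gene pairs that have no observed edge, which is one way to interpret the reconstruction objective mentioned above.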
Cite
Text
Li et al. "Benchmarking Graph Neural Network-Based Imputation Methods on Single-Cell Transcriptomics Data." NeurIPS 2022 Workshops: LMRL, 2022.
Markdown
[Li et al. "Benchmarking Graph Neural Network-Based Imputation Methods on Single-Cell Transcriptomics Data." NeurIPS 2022 Workshops: LMRL, 2022.](https://mlanthology.org/neuripsw/2022/li2022neuripsw-benchmarking/)
BibTeX
@inproceedings{li2022neuripsw-benchmarking,
title = {{Benchmarking Graph Neural Network-Based Imputation Methods on Single-Cell Transcriptomics Data}},
author = {Li, Han-Bo and Torné, Ramon Viñas and Lio, Pietro},
booktitle = {NeurIPS 2022 Workshops: LMRL},
year = {2022},
url = {https://mlanthology.org/neuripsw/2022/li2022neuripsw-benchmarking/}
}