Gradient-Based Gene Selection for Multimodal scRNA-Seq Foundation Models

Abstract

Foundation models have emerged as powerful tools for analyzing single-cell RNA sequencing (scRNA-seq) data. However, selecting informative gene features for both input to the model and analysis in the output remains a critical challenge. Traditional feature selection methods filter on the basis of highly variable genes and analyze them using differential distribution, but they often struggle with scalability and robustness in heterogeneous, high-dimensional datasets. In this study, we explore the limitations of conventional feature selection techniques in the context of a multimodal foundation model and propose alternative gradient-based attribution techniques on learned feature embeddings to improve feature selection. We demonstrate how our selection strategy enhances model performance, overcomes the limitations of traditional approaches, and holds the potential to reveal the inherent polygenicity of diseases.
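To make the idea of gradient-based attribution on learned feature embeddings concrete, the following is a minimal sketch and not the authors' model or pipeline. It assumes a hypothetical toy model in which a cell representation is the expression-weighted sum of per-gene embeddings and a sigmoid readout produces a score. All names (`gene_attributions`, `rank_genes`, `GENE_A`, the weights `w`) are illustrative assumptions. Genes are ranked by the mean absolute gradient-times-input attribution across cells.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gene_attributions(expr, embeddings, w):
    """Gradient-times-input attribution per gene for one cell.

    Hypothetical toy model (an assumption, not the paper's architecture):
        h     = sum_g expr[g] * embeddings[g]
        score = sigmoid(w . h)
    The analytic gradient is
        d score / d embeddings[g] = score * (1 - score) * expr[g] * w
    and the attribution of gene g is the dot product of that gradient
    with the gene's own embedding.
    """
    dim = len(w)
    h = [sum(expr[g] * embeddings[g][k] for g in range(len(expr)))
         for k in range(dim)]
    s = sigmoid(dot(w, h))
    scale = s * (1.0 - s)
    return [scale * expr[g] * dot(w, embeddings[g]) for g in range(len(expr))]


def rank_genes(cells, embeddings, w, gene_names, top_k):
    """Rank genes by mean absolute attribution over all cells."""
    totals = [0.0] * len(gene_names)
    for expr in cells:
        for g, a in enumerate(gene_attributions(expr, embeddings, w)):
            totals[g] += abs(a)
    means = [t / len(cells) for t in totals]
    order = sorted(range(len(gene_names)), key=lambda g: means[g], reverse=True)
    return [gene_names[g] for g in order[:top_k]]


# Illustrative inputs only: three placeholder genes, 2-dimensional embeddings.
gene_names = ["GENE_A", "GENE_B", "GENE_C"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
w = [2.0, 0.5]
cells = [[1.0, 1.0, 5.0], [2.0, 0.0, 3.0]]  # rows: cells, columns: genes

selected = rank_genes(cells, embeddings, w, gene_names, top_k=2)
print(selected)
```

In this toy setup `GENE_C` has the largest expression values yet receives zero attribution, because its embedding does not influence the score. That is the kind of distinction an expression-based filter cannot make on its own.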

Cite

Text

Thadawasin et al. "Gradient-Based Gene Selection for Multimodal scRNA-Seq Foundation Models." ICLR 2025 Workshops: MLGenX, 2025.

BibTeX

@inproceedings{thadawasin2025iclrw-gradientbased,
  title     = {{Gradient-Based Gene Selection for Multimodal scRNA-Seq Foundation Models}},
  author    = {Thadawasin, Pakaphol and Khodaee, Farhan and Zandie, Rohola and Edelman, Elazer R},
  booktitle = {ICLR 2025 Workshops: MLGenX},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/thadawasin2025iclrw-gradientbased/}
}