RNAGym: Benchmarks for RNA Fitness and Structure Prediction
Abstract
Predicting RNA structure and the effects of mutations in RNA is pivotal for numerous biological and medical applications. However, the evaluation of machine learning-based RNA models has been hampered by disparate and limited experimental datasets, along with inconsistent model performance across different RNA types. To address these limitations, we introduce RNAGym, a comprehensive and large-scale benchmark specifically tailored for RNA fitness and structure prediction. This benchmark suite includes over 30 standardized deep mutational scanning assays, covering hundreds of thousands of mutations, and curated RNA structure datasets. We have developed a robust evaluation framework that integrates multiple metrics suitable for both predictive tasks while accounting for the inherent limitations of experimental methods. RNAGym is designed to facilitate a systematic comparison of RNA models, offering an essential resource to enhance the development and understanding of these models within the computational biology community.
Cite
Text
Arora et al. "RNAGym: Benchmarks for RNA Fitness and Structure Prediction." ICLR 2025 Workshops: GEM, 2025.
Markdown
[Arora et al. "RNAGym: Benchmarks for RNA Fitness and Structure Prediction." ICLR 2025 Workshops: GEM, 2025.](https://mlanthology.org/iclrw/2025/arora2025iclrw-rnagym-a/)
BibTeX
@inproceedings{arora2025iclrw-rnagym-a,
title = {{RNAGym: Benchmarks for RNA Fitness and Structure Prediction}},
author = {Arora, Rohit and Angelo, Murphy and Choe, Christian Andrew and Kollasch, Aaron W and Qu, Fiona and Shearer, Courtney A. and Weitzman, Ruben and Gazizov, Artem and Gurev, Sarah and Xie, Erik and Marks, Debora Susan and Notin, Pascal},
booktitle = {ICLR 2025 Workshops: GEM},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/arora2025iclrw-rnagym-a/}
}