EvoFlow-RNA: Generating and Representing Non-Coding RNA with a Language Model
Abstract
RNA plays a critical role across numerous biological functions. Recent advances in language modeling show promise for representing RNA, but the possibility of large-scale RNA design and optimization has yet to be explored. We propose EvoFlow-RNA, a bidirectional RNA language model leveraging masked discrete diffusion models (MDMs) for both generative modeling and representation learning. EvoFlow-RNA bridges the gap between RNA sequence representation and design. It outperforms leading RNA models on six BEACON tasks, excelling in secondary structure prediction. For unconditional generation, it synthesizes diverse RNA sequences with native-like biophysical properties. Furthermore, EvoFlow-RNA can optimize aptamer sequences while preserving binding recognition sites. Our results demonstrate EvoFlow-RNA’s effectiveness in RNA modeling, highlighting the capability and potential of masked discrete diffusion for RNA design. Our code is available at https://github.com/AtomBio/evoflow-rna.
Cite
Text
Patel et al. "EvoFlow-RNA: Generating and Representing Non-Coding RNA with a Language Model." ICLR 2025 Workshops: AI4NA, 2025.
Markdown
[Patel et al. "EvoFlow-RNA: Generating and Representing Non-Coding RNA with a Language Model." ICLR 2025 Workshops: AI4NA, 2025.](https://mlanthology.org/iclrw/2025/patel2025iclrw-evoflowrna/)
BibTeX
@inproceedings{patel2025iclrw-evoflowrna,
title = {{EvoFlow-RNA: Generating and Representing Non-Coding RNA with a Language Model}},
author = {Patel, Sawan and Peng, Fred Zhangzhi and Fraser, Keith and Friedman, Adam David and Chatterjee, Pranam and Yao, Sherwood},
booktitle = {ICLR 2025 Workshops: AI4NA},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/patel2025iclrw-evoflowrna/}
}