GROQ-Seq: A Collaborative, Open Data Approach to Addressing Protein Function Prediction
Abstract
We have developed an experimental platform and unified data ontology for collecting and sharing open-access data on diverse protein functions to enable the development of predictive models linking sequence to function. The experimental strategy for data generation employs the growth-based quantitative sequencing (GROQ-seq) platform as a simple yet adaptable system that can be easily expanded to encompass new functions. This high-throughput experimental platform can produce quantitative functional characterization data for hundreds of thousands of proteins per experiment at a cost of approximately $0.05 per sequence. To date, we have made significant progress in collecting data for our initial protein function: transcription factor binding. We are also developing GROQ-seq for a suite of additional protein functions, including proteases, aminoacyl-tRNA synthetases, RNA polymerases, histidine kinases, single-chain antibody fragments, and a variety of metabolic enzymes. Being both highly scalable and extensible to new protein functions, the GROQ-seq platform enables the collection of the diverse data necessary to create a generalizable model that quantitatively predicts sequence-to-function relationships.
Cite
Text
Ross et al. "GROQ-Seq: A Collaborative, Open Data Approach to Addressing Protein Function Prediction." ICLR 2025 Workshops: GEM, 2025.
Markdown
[Ross et al. "GROQ-Seq: A Collaborative, Open Data Approach to Addressing Protein Function Prediction." ICLR 2025 Workshops: GEM, 2025.](https://mlanthology.org/iclrw/2025/ross2025iclrw-groqseq/)
BibTeX
@inproceedings{ross2025iclrw-groqseq,
title = {{GROQ-Seq: A Collaborative, Open Data Approach to Addressing Protein Function Prediction}},
author = {Ross, David and Spinner, Aviv and d'Oelsnitz, Simon and Ikonomova, Svetlana P and Vasilyeva, Olga and Alperovich, Nina and Sheldon, Kristen and Tretheway, Courtney and Cortade, Dana and DeBenedictis, Erika and Kelly, Peter J},
booktitle = {ICLR 2025 Workshops: GEM},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/ross2025iclrw-groqseq/}
}