GROQ-Seq: A Collaborative, Open Data Approach to Addressing Protein Function Prediction

Abstract

We have developed an experimental platform and unified data ontology for collecting and sharing open-access data on diverse protein functions to enable the development of predictive models linking sequence to function. The experimental strategy for data generation employs the growth-based quantitative sequencing (GROQ-seq) platform, a simple yet adaptable system that can be easily expanded to encompass new functions. This high-throughput experimental platform can produce quantitative functional characterization data for hundreds of thousands of proteins per experiment at a cost of approximately $0.05 per sequence. To date, we have made significant progress in collecting data for our initial protein function: transcription factor binding. We are also developing GROQ-seq for a suite of additional protein functions, including proteases, aminoacyl-tRNA synthetases, RNA polymerases, histidine kinases, single-chain antibody fragments, and a variety of metabolic enzymes. Being both highly scalable and extensible to new protein functions, the GROQ-seq platform enables the collection of the diverse data needed to create a generalizable model that quantitatively predicts sequence-to-function relationships.

Cite

Text

Ross et al. "GROQ-Seq: A Collaborative, Open Data Approach to Addressing Protein Function Prediction." ICLR 2025 Workshops: GEM, 2025.

Markdown

[Ross et al. "GROQ-Seq: A Collaborative, Open Data Approach to Addressing Protein Function Prediction." ICLR 2025 Workshops: GEM, 2025.](https://mlanthology.org/iclrw/2025/ross2025iclrw-groqseq/)

BibTeX

@inproceedings{ross2025iclrw-groqseq,
  title     = {{GROQ-Seq: A Collaborative, Open Data Approach to Addressing Protein Function Prediction}},
  author    = {Ross, David and Spinner, Aviv and d'Oelsnitz, Simon and Ikonomova, Svetlana P and Vasilyeva, Olga and Alperovich, Nina and Sheldon, Kristen and Tretheway, Courtney and Cortade, Dana and DeBenedictis, Erika and Kelly, Peter J},
  booktitle = {ICLR 2025 Workshops: GEM},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/ross2025iclrw-groqseq/}
}