Generative Model for Pseudomonad Genomes

Abstract

Recent advances in genomic sequencing have produced several thousand complete genomes of pseudomonads, a group of bacteria important in many areas of science, ranging from biogeochemical cycling in the environment to bacterial pneumonia in humans. Using these high-quality data sets, combined with tens of thousands of somewhat lower-quality metagenome-assembled genomes, we create a generative model for pseudomonad genomes. We present a GAN model that generates gene family presence/absence lists as a representation of a novel genome. We also demonstrate that the discriminator of this model can be used as a binary classifier to identify incorrect genomes with missing content. In the future, our desired model could be used to generate genomes within a given set of parameters, such as “Generate a genome that is root-associated, drought-resistant, and salt-tolerant and that will produce this natural product.”
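
To make the genome representation concrete, the following is a minimal sketch of how a gene family presence/absence list might be encoded, and how a genome with missing content could be simulated as a negative example for a binary classifier. It is an illustration under stated assumptions, not the authors' pipeline: the family identifiers, the function names `encode_presence_absence` and `drop_families`, and the random-dropout corruption scheme are all hypothetical, and the abstract does not describe how incomplete genomes were actually constructed.

```python
import random

def encode_presence_absence(genome_families, vocabulary):
    """Return a 0/1 list: 1 where the genome contains the family at that index."""
    present = set(genome_families)
    return [1 if fam in present else 0 for fam in vocabulary]

def drop_families(vector, fraction, rng):
    """Simulate missing content by zeroing a fraction of the present families.

    Assumption: uniform random dropout; the source does not specify a scheme.
    """
    present_idx = [i for i, v in enumerate(vector) if v == 1]
    n_drop = int(len(present_idx) * fraction)
    dropped = set(rng.sample(present_idx, n_drop))
    return [0 if i in dropped else v for i, v in enumerate(vector)]

# Hypothetical gene family identifiers, for illustration only.
vocabulary = ["famA", "famB", "famC", "famD", "famE", "famF"]
genome = ["famA", "famC", "famD", "famF"]

vector = encode_presence_absence(genome, vocabulary)
incomplete = drop_families(vector, fraction=0.5, rng=random.Random(0))

print(vector)      # [1, 0, 1, 1, 0, 1]
print(incomplete)  # same length, with half of the present families removed
```

In a setup like this, a GAN generator would output vectors of the same length as `vocabulary`, and the discriminator would score vectors such as `vector` and `incomplete`.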

Cite

Text

Kesapragada et al. "Generative Model for Pseudomonad Genomes." NeurIPS 2022 Workshops: LMRL, 2022.

Markdown

[Kesapragada et al. "Generative Model for Pseudomonad Genomes." NeurIPS 2022 Workshops: LMRL, 2022.](https://mlanthology.org/neuripsw/2022/kesapragada2022neuripsw-generative/)

BibTeX

@inproceedings{kesapragada2022neuripsw-generative,
  title     = {{Generative Model for Pseudomonad Genomes}},
  author    = {Kesapragada, Manasa and Canon, R Shane and Jungbluth, Sean P and Joachimiak, Marcin P and Arkin, Adam P and Dehal, Paramvir S},
  booktitle = {NeurIPS 2022 Workshops: LMRL},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/kesapragada2022neuripsw-generative/}
}