Phenotype Inference from Text and Genomic Data

Abstract

We describe ProTraits, a machine learning pipeline that systematically annotates microbes with phenotypes using a large amount of textual data from scientific literature and other online resources, as well as genome sequencing data. Moreover, by relying on a multi-view non-negative matrix factorization approach, the ProTraits pipeline can also discover novel phenotypic concepts from unstructured text. We present the main components of the pipeline and outline the challenges of applying it to other fields.
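
To make the multi-view non-negative matrix factorization idea concrete, the following is a minimal sketch and not the ProTraits implementation. It assumes the simplest joint formulation, in which every view X_v is approximated as W @ H_v with a single non-negative factor W shared across views, fitted with standard multiplicative updates on squared error. The function names `joint_nmf` and `reconstruction_loss`, the random toy matrices, and the choice of k=2 are all hypothetical and chosen only for illustration.

```python
import numpy as np


def joint_nmf(views, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate each non-negative view X_v (n x d_v) as W @ H_v.

    W (n x k) is shared by all views; each H_v (k x d_v) is view-specific.
    Illustrative formulation only (an assumption, not the paper's method).
    """
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    W = rng.random((n, k))
    Hs = [rng.random((k, X.shape[1])) for X in views]
    for _ in range(n_iter):
        # Multiplicative update for each view-specific factor H_v.
        Hs = [H * (W.T @ X) / (W.T @ W @ H + eps) for X, H in zip(views, Hs)]
        # Multiplicative update for the shared factor W, pooling all views.
        numer = sum(X @ H.T for X, H in zip(views, Hs))
        denom = W @ sum(H @ H.T for H in Hs) + eps
        W = W * numer / denom
    return W, Hs


def reconstruction_loss(views, W, Hs):
    """Sum of squared Frobenius errors over all views."""
    return sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(views, Hs))


# Toy stand-ins for a text-derived view and a genome-derived view
# of the same 6 organisms (random data, for illustration only).
rng = np.random.default_rng(1)
text_view = rng.random((6, 8))
genome_view = rng.random((6, 5))

W, Hs = joint_nmf([text_view, genome_view], k=2)
```

In this sketch each row of W gives one organism's loadings on k latent components that are consistent across both views; the rows of each H_v describe those components in terms of that view's features.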

Cite

Text

Brbic et al. "Phenotype Inference from Text and Genomic Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2017. doi:10.1007/978-3-319-71273-4_34

Markdown

[Brbic et al. "Phenotype Inference from Text and Genomic Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2017.](https://mlanthology.org/ecmlpkdd/2017/brbic2017ecmlpkdd-phenotype/) doi:10.1007/978-3-319-71273-4_34

BibTeX

@inproceedings{brbic2017ecmlpkdd-phenotype,
  title     = {{Phenotype Inference from Text and Genomic Data}},
  author    = {Brbic, Maria and Piskorec, Matija and Vidulin, Vedrana and Krisko, Anita and Smuc, Tomislav and Supek, Fran},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2017},
  pages     = {373-377},
  doi       = {10.1007/978-3-319-71273-4_34},
  url       = {https://mlanthology.org/ecmlpkdd/2017/brbic2017ecmlpkdd-phenotype/}
}