Phenotype Inference from Text and Genomic Data
Abstract
We describe ProTraits, a machine learning pipeline that systematically annotates microbes with phenotypes using a large amount of textual data from scientific literature and other online resources, as well as genome sequencing data. Moreover, by relying on a multi-view non-negative matrix factorization approach, the ProTraits pipeline can also discover novel phenotypic concepts from unstructured text. We present the main components of the pipeline and outline challenges in applying it to other fields.
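The abstract does not spell out the factorization itself, so the following is only a rough sketch of the general idea behind multi-view non-negative matrix factorization, not the ProTraits implementation. It assumes the simplest possible formulation: every view (for example a text-derived matrix and a genome-derived matrix over the same organisms) shares one non-negative sample factor `W`, while each view has its own factor `H`. The function names `joint_nmf` and `reconstruction_error`, the toy matrices, and the choice of multiplicative updates are all illustrative assumptions.

```python
import numpy as np

def joint_nmf(views, k, n_iter=200, seed=0, eps=1e-9):
    """Factorize each view X_i (n x d_i) as W @ H_i with a shared W (n x k).

    Hypothetical helper: a simplified shared-factor NMF using
    multiplicative updates, not the method from the paper.
    """
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    W = rng.random((n, k))                                 # shared sample factor
    Hs = [rng.random((k, X.shape[1])) for X in views]      # one factor per view
    for _ in range(n_iter):
        # Update each view-specific factor with W held fixed.
        for i, X in enumerate(views):
            Hs[i] *= (W.T @ X) / (W.T @ W @ Hs[i] + eps)
        # Update the shared factor using evidence pooled over all views.
        num = sum(X @ H.T for X, H in zip(views, Hs))
        den = sum(W @ H @ H.T for H in Hs) + eps
        W *= num / den
    return W, Hs

def reconstruction_error(views, W, Hs):
    """Sum of squared Frobenius errors across all views."""
    return sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(views, Hs))

# Toy data (made up): 6 organisms, 5 text features, 4 genomic features.
rng = np.random.default_rng(1)
X_text = rng.random((6, 5))
X_genome = rng.random((6, 4))
views = [X_text, X_genome]

W, Hs = joint_nmf(views, k=2)
print(W.shape, [H.shape for H in Hs])
```

In a sketch like this, each column of `W` groups organisms that behave similarly across both views, and the corresponding rows of the text factor indicate which terms characterize that group. This is one way such a factorization could surface candidate phenotypic concepts.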
Cite
Text
Brbic et al. "Phenotype Inference from Text and Genomic Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2017. doi:10.1007/978-3-319-71273-4_34
Markdown
[Brbic et al. "Phenotype Inference from Text and Genomic Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2017.](https://mlanthology.org/ecmlpkdd/2017/brbic2017ecmlpkdd-phenotype/) doi:10.1007/978-3-319-71273-4_34
BibTeX
@inproceedings{brbic2017ecmlpkdd-phenotype,
title = {{Phenotype Inference from Text and Genomic Data}},
author = {Brbic, Maria and Piskorec, Matija and Vidulin, Vedrana and Krisko, Anita and Smuc, Tomislav and Supek, Fran},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2017},
pages = {373-377},
doi = {10.1007/978-3-319-71273-4_34},
url = {https://mlanthology.org/ecmlpkdd/2017/brbic2017ecmlpkdd-phenotype/}
}