Unsupervised Multiple-Instance Learning for Functional Profiling of Genomic Data

Abstract

Multiple-instance learning (MIL) is a popular concept in the AI community for supporting supervised learning when only incomplete knowledge is available. We propose an original reformulation of the MIL concept for the unsupervised context (UMIL), which can serve as a broader framework for clustering data objects that are adequately described by a multiple-instance representation. We derive three algorithmic solutions from conventional methods: agglomerative clustering, partition clustering, and MIL’s citation-kNN approach. Using standard clustering quality measures, we evaluated these algorithms within a bioinformatic framework for the functional profiling of two genomic data sets, after combining expression data with biological annotations into a UMIL representation. Our analysis highlighted meaningful interaction patterns that relate biological processes and regulatory pathways in coherent functional modules, uncovering profound features of the biological model. These results indicate UMIL’s usefulness for exploring hidden behavioral patterns in complex data.
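To make the idea of clustering multiple-instance objects more concrete, the following is a minimal sketch of agglomerative clustering over bags of instances. It is not the authors' implementation: the abstract does not specify the bag-level distance or the linkage rule, so the minimal Hausdorff distance (a distance commonly associated with citation-kNN) and average linkage are assumptions chosen for illustration. The names `min_hausdorff` and `agglomerate_bags`, and the toy data, are hypothetical.

```python
from itertools import combinations
from math import dist


def min_hausdorff(bag_a, bag_b):
    """Smallest Euclidean distance between any instance of bag_a and any of bag_b."""
    return min(dist(a, b) for a in bag_a for b in bag_b)


def agglomerate_bags(bags, n_clusters):
    """Average-linkage agglomerative clustering over bags (assumed linkage rule).

    Returns a list of clusters, each a sorted list of bag indices.
    """
    # Precompute the pairwise bag-level distances once.
    d = {(i, j): min_hausdorff(bags[i], bags[j])
         for i, j in combinations(range(len(bags)), 2)}

    def linkage(c1, c2):
        # Mean bag-to-bag distance between two clusters.
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(d[p] for p in pairs) / len(pairs)

    # Start with one cluster per bag and merge until n_clusters remain.
    clusters = [[i] for i in range(len(bags))]
    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]))
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # safe: a < b, so index a is unaffected
    return clusters


# Toy data (hypothetical): each bag is one object, each tuple one instance.
# Treating a gene as a bag of annotation-derived vectors is an assumption here.
bags = [
    [(0.0, 0.0), (0.2, 0.1)],
    [(0.1, 0.0), (5.0, 5.0)],
    [(9.0, 9.0), (9.5, 9.2)],
    [(9.1, 9.1)],
]
clusters = agglomerate_bags(bags, n_clusters=2)
print(clusters)
```

Because the distance is defined between bags rather than between single vectors, bags of different sizes can be compared directly, and a bag can join a cluster on the strength of a single close instance. Swapping in a different bag-level distance only requires replacing `min_hausdorff`.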

Cite

Text

Henegar et al. "Unsupervised Multiple-Instance Learning for Functional Profiling of Genomic Data." European Conference on Machine Learning, 2006. doi:10.1007/11871842_21

Markdown

[Henegar et al. "Unsupervised Multiple-Instance Learning for Functional Profiling of Genomic Data." European Conference on Machine Learning, 2006.](https://mlanthology.org/ecmlpkdd/2006/henegar2006ecml-unsupervised/) doi:10.1007/11871842_21

BibTeX

@inproceedings{henegar2006ecml-unsupervised,
  title     = {{Unsupervised Multiple-Instance Learning for Functional Profiling of Genomic Data}},
  author    = {Henegar, Corneliu and Clément, Karine and Zucker, Jean-Daniel},
  booktitle = {European Conference on Machine Learning},
  year      = {2006},
  pages     = {186--197},
  doi       = {10.1007/11871842_21},
  url       = {https://mlanthology.org/ecmlpkdd/2006/henegar2006ecml-unsupervised/}
}