Explainable Multiple Instance Learning with Instance Selection Randomized Trees

Abstract

Multiple Instance Learning (MIL) aims at extracting patterns from a collection of samples, where individual samples (called bags) are represented by a group of multiple feature vectors (called instances) instead of a single feature vector. Grouping instances into bags not only helps to formulate some learning problems more naturally, it also significantly reduces label acquisition costs, as only the labels for bags are needed, not for the inner instances. However, in application domains where inference transparency is demanded, such as network security, the sample attribution requirements are often asymmetric with respect to the training and application phases. While in the training phase it is very convenient to supply labels only for bags, in the application phase it is generally not enough to provide decisions only on the bag level, because the inferred verdicts need to be explained on the level of individual instances. Unfortunately, the majority of recent MIL classifiers do not focus on this real-world need. In this paper, we address this problem and propose a new tree-based MIL classifier able to identify the instances responsible for positive bag predictions. Results from an empirical evaluation on a large-scale network security dataset also show that the classifier achieves superior performance when compared with prior-art methods.
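
To make the bag/instance setup concrete, here is a minimal sketch of the data layout the abstract describes: labels exist only per bag, while a positive prediction is traced back to individual instances. It is not the paper's tree-based classifier. The linear scorer, the weights `w` and `b`, the toy data, and the function names are all hypothetical, and the "a bag is positive if any instance is positive" rule is the standard MIL assumption, used here only for illustration.

```python
import numpy as np

# Toy data (hypothetical): each bag is an (n_instances, n_features) array.
# Only bags carry labels; instances are unlabeled.
bags = [
    np.array([[0.1, 0.2], [0.0, 0.3]]),              # negative bag
    np.array([[0.2, 0.1], [2.5, 3.0], [0.1, 0.0]]),  # positive bag
]
bag_labels = [0, 1]


def instance_scores(bag, w, b):
    """Hypothetical linear instance scorer, a stand-in for a learned model."""
    return bag @ w + b


def predict_bag(bag, w, b):
    """Return (bag_prediction, indices_of_responsible_instances).

    Assumes the standard MIL rule: a bag is positive if at least one
    instance scores above zero. The positive-scoring instances serve
    as the instance-level explanation of the bag-level verdict.
    """
    scores = instance_scores(bag, w, b)
    responsible = np.flatnonzero(scores > 0)
    return int(responsible.size > 0), responsible.tolist()


# Hand-picked parameters for illustration only (not learned).
w = np.array([1.0, 1.0])
b = -1.0

predictions = [predict_bag(bag, w, b) for bag in bags]
for (pred, responsible), label in zip(predictions, bag_labels):
    print(f"label={label} predicted={pred} responsible_instances={responsible}")
```

With these hand-picked parameters, only the second instance of the second bag scores above zero, so that bag is flagged as positive and the instance index is returned as its explanation.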

Cite

Text

Komárek et al. "Explainable Multiple Instance Learning with Instance Selection Randomized Trees." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021. doi:10.1007/978-3-030-86520-7_44

Markdown

[Komárek et al. "Explainable Multiple Instance Learning with Instance Selection Randomized Trees." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021.](https://mlanthology.org/ecmlpkdd/2021/komarek2021ecmlpkdd-explainable/) doi:10.1007/978-3-030-86520-7_44

BibTeX

@inproceedings{komarek2021ecmlpkdd-explainable,
  title     = {{Explainable Multiple Instance Learning with Instance Selection Randomized Trees}},
  author    = {Komárek, Tomás and Brabec, Jan and Somol, Petr},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2021},
  pages     = {715--730},
  doi       = {10.1007/978-3-030-86520-7_44},
  url       = {https://mlanthology.org/ecmlpkdd/2021/komarek2021ecmlpkdd-explainable/}
}