On the Stratification of Multi-Label Data

Sechidis, Konstantinos; Tsoumakas, Grigorios; Vlahavas, Ioannis P.

doi:10.1007/978-3-642-23808-6_10

ECML-PKDD 2011 pp. 145-158

Abstract

Stratified sampling is a sampling method that takes into account the existence of disjoint groups within a population and produces samples in which the proportions of these groups are maintained. In single-label classification tasks, groups are differentiated based on the value of the target variable. In multi-label learning tasks, however, where there are multiple target variables, it is not clear how stratified sampling could or should be performed. This paper investigates stratification in the multi-label data context. It considers two stratification methods for multi-label data and empirically compares them, along with random sampling, on a number of datasets and according to a number of evaluation criteria. The results reveal some interesting conclusions with respect to the utility of each method for particular types of multi-label datasets.
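
To make the single-label case described in the abstract concrete, the following is a minimal sketch of stratified sampling for one target variable. It is not taken from the paper: the function name `stratified_split`, its parameters, and the toy labels are all assumptions chosen for illustration. It groups examples by class value and draws the same fraction from each group, so the class proportions carry over to the sample.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split example indices into (train, test), sampling each class separately.

    Hypothetical helper for illustration only; single-label data is assumed.
    """
    rng = random.Random(seed)

    # Build the disjoint groups: one list of indices per class value.
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)

    train, test = [], []
    for label in sorted(groups):
        members = groups[label]
        rng.shuffle(members)
        # Take the same fraction from every group so proportions are kept.
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)


# Toy data (assumed): 20 positive and 80 negative examples.
labels = ["pos"] * 20 + ["neg"] * 80
train_idx, test_idx = stratified_split(labels, test_fraction=0.25, seed=0)
test_pos = sum(labels[i] == "pos" for i in test_idx)
train_pos = sum(labels[i] == "pos" for i in train_idx)
```

In this toy setting both parts keep the original 20% share of positive examples. With several target variables per example there is no single class value to group by, which is the difficulty the abstract raises.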

Cite

Text

Sechidis et al. "On the Stratification of Multi-Label Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2011. doi:10.1007/978-3-642-23808-6_10

Markdown

[Sechidis et al. "On the Stratification of Multi-Label Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2011.](https://mlanthology.org/ecmlpkdd/2011/sechidis2011ecmlpkdd-stratification/) doi:10.1007/978-3-642-23808-6_10

BibTeX

@inproceedings{sechidis2011ecmlpkdd-stratification,
  title     = {{On the Stratification of Multi-Label Data}},
  author    = {Sechidis, Konstantinos and Tsoumakas, Grigorios and Vlahavas, Ioannis P.},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2011},
  pages     = {145--158},
  doi       = {10.1007/978-3-642-23808-6_10},
  url       = {https://mlanthology.org/ecmlpkdd/2011/sechidis2011ecmlpkdd-stratification/}
}