Synthetic Oversampling of Multi-Label Data Based on Local Label Distribution

Abstract

Class imbalance is an inherent characteristic of multi-label data that affects the prediction accuracy of most multi-label learning methods. One efficient strategy for dealing with this problem is to apply resampling techniques before training the classifier. Existing multi-label sampling methods alleviate the (global) imbalance of multi-label datasets. However, performance degradation is mainly due to rare subconcepts and class overlap, which can be analysed by looking at the local characteristics of the minority examples rather than the imbalance of the whole dataset. We propose a new method for synthetic oversampling of multi-label data that focuses on local label distribution to generate more diverse and better-labeled instances. Experimental results on 13 multi-label datasets demonstrate the effectiveness of the proposed approach across a variety of evaluation measures, particularly in the case of an ensemble of classifiers trained on repeated samples of the original data.
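The contrast the abstract draws between global imbalance and local label distribution can be illustrated with a minimal sketch. This is not the algorithm from the paper: the function names `global_imbalance_ratio` and `local_opposite_fraction`, the neighbourhood size `k`, and the use of Euclidean distance are all assumptions chosen for illustration. The first function summarises each label over the whole dataset; the second looks at each instance's nearest neighbours and reports how many of them disagree with it on each label, which is one plausible way to surface overlap and rare subconcepts.

```python
import numpy as np


def global_imbalance_ratio(Y):
    """Per-label ratio of majority-class count to minority-class count.

    Y is an (n_instances, n_labels) binary matrix. This is a global
    statistic: it ignores where the minority examples lie in feature space.
    """
    pos = Y.sum(axis=0)
    neg = Y.shape[0] - pos
    minority = np.minimum(pos, neg)
    majority = np.maximum(pos, neg)
    # Guard against division by zero for labels with an empty class.
    return majority / np.maximum(minority, 1)


def local_opposite_fraction(X, Y, k=3):
    """Fraction of each instance's k nearest neighbours that disagree per label.

    Hypothetical illustration of a "local label distribution" measure
    (assumed here, not taken from the paper). Returns an array of shape
    (n_instances, n_labels) with values in [0, 1]; a high value for a
    minority example suggests it lies in an overlapping region.
    """
    # Pairwise Euclidean distances (assumed metric), excluding self-matches.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]   # shape (n, k)
    neighbour_labels = Y[neighbours]               # shape (n, k, n_labels)
    return (neighbour_labels != Y[:, None, :]).mean(axis=1)
```

Under these assumptions, a label can look only mildly imbalanced globally while some of its minority examples sit entirely among opposite-class neighbours. Those are the cases a locally informed oversampler would want to treat differently from minority examples in a clean region.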

Cite

Text

Liu and Tsoumakas. "Synthetic Oversampling of Multi-Label Data Based on Local Label Distribution." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2019. doi:10.1007/978-3-030-46147-8_11

Markdown

[Liu and Tsoumakas. "Synthetic Oversampling of Multi-Label Data Based on Local Label Distribution." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2019.](https://mlanthology.org/ecmlpkdd/2019/liu2019ecmlpkdd-synthetic/) doi:10.1007/978-3-030-46147-8_11

BibTeX

@inproceedings{liu2019ecmlpkdd-synthetic,
  title     = {{Synthetic Oversampling of Multi-Label Data Based on Local Label Distribution}},
  author    = {Liu, Bin and Tsoumakas, Grigorios},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2019},
  pages     = {180--193},
  doi       = {10.1007/978-3-030-46147-8_11},
  url       = {https://mlanthology.org/ecmlpkdd/2019/liu2019ecmlpkdd-synthetic/}
}