Classification and Novel Class Detection of Data Streams in a Dynamic Feature Space

Abstract

Data stream classification poses many challenges, most of which are not addressed by the state of the art. We present DXMiner, which addresses four major challenges to data stream classification, namely, infinite length, concept-drift, concept-evolution, and feature-evolution. Data streams are assumed to be infinite in length, which necessitates single-pass incremental learning techniques. Concept-drift occurs in a data stream when the underlying concept changes over time. Most existing data stream classification techniques address only the infinite length and concept-drift problems. However, concept-evolution and feature-evolution are also major challenges, and these are ignored by most of the existing approaches. Concept-evolution occurs in the stream when novel classes arrive, and feature-evolution occurs when new features emerge in the stream. Our previous work addresses the concept-evolution problem in addition to addressing the infinite length and concept-drift problems. Most of the existing data stream classification techniques, including our previous work, assume that the feature space of the data points in the stream is static. This assumption may be impractical for some types of data, for example text data. DXMiner considers the dynamic nature of the feature space and provides an elegant solution for classification and novel class detection when the feature space is dynamic. We show that our approach outperforms state-of-the-art stream classification techniques in classifying and detecting novel classes in real data streams.
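
To make the notion of feature-evolution concrete, the following is a minimal illustrative sketch, not DXMiner's actual algorithm, which the abstract does not describe. It assumes a bag-of-words representation for text chunks and shows how two chunks with different vocabularies can be compared by projecting both onto the union of their feature sets. The helper names `bag_of_words` and `align_to_union`, and the example sentences, are hypothetical.

```python
from collections import Counter


def bag_of_words(text):
    """Return a sparse feature vector mapping each word to its count."""
    return Counter(text.lower().split())


def align_to_union(vec_a, vec_b):
    """Project two sparse vectors onto the union of their feature sets.

    A feature missing from one vector is given the value 0 there, so
    neither vector loses any of its original features. This is an
    assumed strategy for illustration only.
    """
    features = sorted(set(vec_a) | set(vec_b))
    dense_a = [vec_a.get(f, 0) for f in features]
    dense_b = [vec_b.get(f, 0) for f in features]
    return features, dense_a, dense_b


# Two hypothetical stream chunks whose vocabularies differ.
old_chunk = bag_of_words("stock market falls")
new_chunk = bag_of_words("crypto market rallies")

features, a, b = align_to_union(old_chunk, new_chunk)

# Features that emerged in the newer chunk (feature-evolution).
new_features = sorted(set(new_chunk) - set(old_chunk))

print(features)
print(a)
print(b)
print(new_features)
```

A classifier trained only on the older chunk's vocabulary would have no dimensions for the words in `new_features`, which is the situation a static feature space assumption cannot handle.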

Cite

Text

Masud et al. "Classification and Novel Class Detection of Data Streams in a Dynamic Feature Space." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010. doi:10.1007/978-3-642-15883-4_22

Markdown

[Masud et al. "Classification and Novel Class Detection of Data Streams in a Dynamic Feature Space." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010.](https://mlanthology.org/ecmlpkdd/2010/masud2010ecmlpkdd-classification/) doi:10.1007/978-3-642-15883-4_22

BibTeX

@inproceedings{masud2010ecmlpkdd-classification,
  title     = {{Classification and Novel Class Detection of Data Streams in a Dynamic Feature Space}},
  author    = {Masud, Mohammad M. and Chen, Qing and Gao, Jing and Khan, Latifur and Han, Jiawei and Thuraisingham, Bhavani},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2010},
  pages     = {337-352},
  doi       = {10.1007/978-3-642-15883-4_22},
  url       = {https://mlanthology.org/ecmlpkdd/2010/masud2010ecmlpkdd-classification/}
}