Semi-Supervised Learning for Multi-Component Data Classification

Abstract

This paper presents a method for designing a semi-supervised classifier for multi-component data, such as web pages that consist of text and link information. The method is a hybrid of generative and discriminative approaches that takes advantage of both. For each component, we consider an individual generative model trained on labeled samples and a second model introduced to reduce the effect of the bias that arises when labeled samples are few. We then construct a hybrid classifier by combining all of these models according to the maximum entropy principle. In experiments on three test collections, including web pages and technical papers, the hybrid approach improved the generalization performance of multi-component data classification.
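The abstract does not give the combination formula, so the following is only a minimal sketch of one common reading: a maximum entropy combination of several models typically takes a log-linear form, in which each model's class-conditional log-likelihood is scaled by a weight and the weighted sum is normalized across classes. The function name `hybrid_posterior`, the argument names, and the hand-picked weights are all hypothetical. In the method described above the combination weights would be learned rather than fixed, and the component models would be fitted to data.

```python
import math


def hybrid_posterior(gen_loglik, bias_loglik, gamma_gen, gamma_bias, mu):
    """Log-linear combination of per-component models (illustrative only).

    gen_loglik[j][k]  : log-likelihood of component j under class k from the
                        generative model trained on labeled samples
    bias_loglik[j][k] : log-likelihood of component j under class k from the
                        bias-correction model
    gamma_gen[j], gamma_bias[j] : combination weights (assumed given here)
    mu[k]             : per-class bias term (assumed given here)
    """
    scores = []
    for k in range(len(mu)):
        score = mu[k]
        for j in range(len(gen_loglik)):
            score += gamma_gen[j] * gen_loglik[j][k]
            score += gamma_bias[j] * bias_loglik[j][k]
        scores.append(score)

    # Softmax over classes, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy input: two components (say, text and links) and two classes.
gen = [[math.log(0.7), math.log(0.3)], [math.log(0.4), math.log(0.6)]]
bias = [[math.log(0.6), math.log(0.4)], [math.log(0.5), math.log(0.5)]]
posterior = hybrid_posterior(gen, bias, [1.0, 0.5], [0.5, 0.5], [0.0, 0.0])
predicted_class = posterior.index(max(posterior))
```

With all weights and biases set to zero the sketch returns a uniform distribution over classes, which is the maximum entropy solution when no model contributes any evidence.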

Cite

Text

Fujino et al. "Semi-Supervised Learning for Multi-Component Data Classification." International Joint Conference on Artificial Intelligence, 2007.

Markdown

[Fujino et al. "Semi-Supervised Learning for Multi-Component Data Classification." International Joint Conference on Artificial Intelligence, 2007.](https://mlanthology.org/ijcai/2007/fujino2007ijcai-semi/)

BibTeX

@inproceedings{fujino2007ijcai-semi,
  title     = {{Semi-Supervised Learning for Multi-Component Data Classification}},
  author    = {Fujino, Akinori and Ueda, Naonori and Saito, Kazumi},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2007},
  pages     = {2754--2759},
  url       = {https://mlanthology.org/ijcai/2007/fujino2007ijcai-semi/}
}