Transferring Naive Bayes Classifiers for Text Classification

Abstract

A basic assumption in traditional machine learning is that the training and test data follow the same distribution. This assumption may not hold in many practical situations, and we may be forced to learn a prediction model from data drawn from a different distribution. For example, this may be the case when it is expensive to label data in the domain of interest, while plenty of labeled data is available in a related but different domain. In this paper, we propose a novel transfer-learning algorithm for text classification based on an EM-based Naive Bayes classifier. Our solution is to first estimate the initial probabilities under the distribution Dℓ of a labeled data set, and then use an EM algorithm to revise the model for a different distribution Du of the unlabeled test data. We show that our algorithm is very effective on several different pairs of domains, where the distance between the distributions is measured using the Kullback-Leibler (KL) divergence. Moreover, the KL divergence is used to set the trade-off parameters in our algorithm. In the experiments, our algorithm outperforms traditional supervised and semi-supervised learning algorithms when the distributions of the training and test sets are increasingly different.
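The two-stage idea in the abstract (initialize a Naive Bayes model on labeled data from Dℓ, then refine it with EM on unlabeled data from Du) can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic EM-based multinomial Naive Bayes, and the names `fit_nb`, `posteriors`, `em_transfer_nb`, and `kl_divergence`, the fixed weight `lam`, the smoothing constant `alpha`, and the toy count matrices are all assumptions made for illustration. In particular, the abstract says the trade-off parameters are chosen from the KL divergence, whereas here `lam` is simply set by hand.

```python
import numpy as np

def fit_nb(X, R, alpha=1.0):
    """M-step: multinomial Naive Bayes from (possibly soft, weighted) labels R."""
    prior = R.sum(axis=0) + alpha
    prior = prior / prior.sum()
    word = R.T @ X + alpha                      # shape: (n_classes, n_words)
    word = word / word.sum(axis=1, keepdims=True)
    return np.log(prior), np.log(word)

def posteriors(X, log_prior, log_word):
    """E-step: class posteriors P(c | d) for each document row in X."""
    log_joint = X @ log_word.T + log_prior
    log_joint = log_joint - log_joint.max(axis=1, keepdims=True)  # stability
    p = np.exp(log_joint)
    return p / p.sum(axis=1, keepdims=True)

def em_transfer_nb(X_l, y_l, X_u, n_classes, lam=0.5, n_iter=20):
    """Initialize on labeled data, then run EM over labeled + unlabeled data.

    lam is a hypothetical hand-set weight on the labeled documents;
    the unlabeled documents receive weight (1 - lam).
    """
    X_l = np.asarray(X_l, dtype=float)
    X_u = np.asarray(X_u, dtype=float)
    R_l = np.eye(n_classes)[np.asarray(y_l)]    # one-hot labels
    log_prior, log_word = fit_nb(X_l, R_l)      # initial estimate under D_l
    X_all = np.vstack([X_l, X_u])
    for _ in range(n_iter):
        R_u = posteriors(X_u, log_prior, log_word)          # E-step on D_u
        R_all = np.vstack([lam * R_l, (1.0 - lam) * R_u])
        log_prior, log_word = fit_nb(X_all, R_all)          # M-step
    return posteriors(X_u, log_prior, log_word)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in bits between two unnormalized word-count vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log2((p + eps) / (q + eps))))

# Toy word-count matrices (rows: documents, columns: vocabulary terms).
X_l = np.array([[3, 0, 1, 0], [2, 0, 0, 0], [0, 3, 0, 1], [0, 2, 0, 0]])
y_l = np.array([0, 0, 1, 1])
X_u = np.array([[1, 0, 3, 0], [0, 1, 0, 3], [2, 0, 2, 0], [0, 2, 0, 2]])

post_u = em_transfer_nb(X_l, y_l, X_u, n_classes=2)
pred_u = post_u.argmax(axis=1)
shift = kl_divergence(X_u.sum(axis=0), X_l.sum(axis=0))
```

Here `pred_u` holds the predicted class for each unlabeled document, and `shift` gives a rough measure of how far the word distribution of the unlabeled set has moved from that of the labeled set. A larger `shift` would argue for a smaller `lam`, so that the model leans more on the unlabeled data, though how to map one to the other is left open in this sketch.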

Cite

Text

Dai et al. "Transferring Naive Bayes Classifiers for Text Classification." AAAI Conference on Artificial Intelligence, 2007.

Markdown

[Dai et al. "Transferring Naive Bayes Classifiers for Text Classification." AAAI Conference on Artificial Intelligence, 2007.](https://mlanthology.org/aaai/2007/dai2007aaai-transferring/)

BibTeX

@inproceedings{dai2007aaai-transferring,
  title     = {{Transferring Naive Bayes Classifiers for Text Classification}},
  author    = {Dai, Wenyuan and Xue, Gui-Rong and Yang, Qiang and Yu, Yong},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2007},
  pages     = {540-545},
  url       = {https://mlanthology.org/aaai/2007/dai2007aaai-transferring/}
}