Semi-Supervised Multinomial Naive Bayes for Text Classification by Leveraging Word-Level Statistical Constraint
Abstract
Multinomial Naive Bayes with Expectation Maximization (MNB-EM) is a standard semi-supervised learning method to augment Multinomial Naive Bayes (MNB) for text classification. Despite its success, MNB-EM is unstable: it may either improve or degrade MNB. We believe this is because MNB-EM lacks the ability to preserve the class distribution over words. In this paper, we propose a novel method that augments MNB-EM by leveraging word-level statistical constraints to preserve the class distribution over words. These word-level constraints are further converted into constraints on the document posteriors generated by MNB-EM. Experiments demonstrate that our method consistently improves MNB-EM and remarkably outperforms state-of-the-art baselines.
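As background, the MNB-EM baseline the abstract refers to alternates between inferring class posteriors for unlabeled documents (E-step) and refitting the MNB parameters on labeled plus soft-labeled data (M-step). The sketch below illustrates that plain baseline only, not the authors' constrained method; the toy count matrices, vocabulary size, and iteration count are illustrative assumptions.

```python
# Minimal sketch of the plain MNB-EM baseline (soft EM), NOT the
# paper's word-level-constrained variant. Toy data for illustration.
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.default_rng(0)

# Toy bag-of-words counts: 6 labeled docs, 20 unlabeled docs, 5-word vocab.
X_lab = rng.integers(0, 5, size=(6, 5))
y_lab = np.array([0, 0, 0, 1, 1, 1])
X_unl = rng.integers(0, 5, size=(20, 5))

# Initialize MNB on the labeled data only.
clf = MultinomialNB()
clf.fit(X_lab, y_lab)

for _ in range(10):
    # E-step: class posteriors P(c | d) for each unlabeled document.
    post = clf.predict_proba(X_unl)
    # M-step: refit on labeled data plus unlabeled data, where each
    # unlabeled doc contributes to both classes weighted by its posterior.
    X_all = np.vstack([X_lab, X_unl, X_unl])
    y_all = np.concatenate([y_lab, np.zeros(20, int), np.ones(20, int)])
    w = np.concatenate([np.ones(6), post[:, 0], post[:, 1]])
    clf = MultinomialNB().fit(X_all, y_all, sample_weight=w)
```

The paper's contribution, per the abstract, is to add constraints on these document posteriors so that the class distribution over words is preserved across EM iterations.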
Cite

Text

Zhao et al. "Semi-Supervised Multinomial Naive Bayes for Text Classification by Leveraging Word-Level Statistical Constraint." AAAI Conference on Artificial Intelligence, 2016. doi:10.1609/AAAI.V30I1.10345

Markdown

[Zhao et al. "Semi-Supervised Multinomial Naive Bayes for Text Classification by Leveraging Word-Level Statistical Constraint." AAAI Conference on Artificial Intelligence, 2016.](https://mlanthology.org/aaai/2016/zhao2016aaai-semi/) doi:10.1609/AAAI.V30I1.10345

BibTeX
@inproceedings{zhao2016aaai-semi,
title = {{Semi-Supervised Multinomial Naive Bayes for Text Classification by Leveraging Word-Level Statistical Constraint}},
author = {Zhao, Li and Huang, Minlie and Yao, Ziyu and Su, Rongwei and Jiang, Yingying and Zhu, Xiaoyan},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2016},
pages = {2877-2884},
doi = {10.1609/AAAI.V30I1.10345},
url = {https://mlanthology.org/aaai/2016/zhao2016aaai-semi/}
}