Weakly-Supervised Text Classification with Wasserstein Barycenters Regularization
Abstract
Weakly-supervised text classification aims to train predictive models with unlabeled texts and a few representative words for each class, referred to as category words, rather than with labeled texts. Such weak supervision is much cheaper and easier to collect in real-world scenarios. To address this task, we propose a novel deep classification model, namely Weakly-supervised Text Classification with Wasserstein Barycenter Regularization (WTC-WBR). Specifically, we initialize the pseudo-labels of texts using category word occurrences, and formulate a weakly self-training framework that iteratively updates the weakly-supervised targets by combining the pseudo-labels with the sharpened predictions. Most importantly, we propose a Wasserstein barycenter regularization with the weakly-supervised targets on the deep feature space. The intuition is that texts tend to lie close to the corresponding Wasserstein barycenter indicated by their weakly-supervised targets. Another benefit is that the regularization can capture the geometric information of the deep feature space to boost the discriminative power of deep features. Experimental results demonstrate that WTC-WBR outperforms existing weakly-supervised baselines and achieves performance comparable to semi-supervised and supervised baselines.
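The pseudo-label initialization and target update described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the function names, normalized occurrence counts as pseudo-labels, a uniform fallback for texts containing no category word, temperature-based sharpening, and a convex combination with weight `alpha`. The Wasserstein barycenter regularization itself is not shown.

```python
import numpy as np

def init_pseudo_labels(docs, category_words):
    """Assumed initialization: normalized category-word counts per class."""
    counts = np.zeros((len(docs), len(category_words)))
    for i, doc in enumerate(docs):
        tokens = doc.lower().split()
        for k, words in enumerate(category_words):
            counts[i, k] = sum(tokens.count(w) for w in words)
    totals = counts.sum(axis=1, keepdims=True)
    uniform = np.full_like(counts, 1.0 / counts.shape[1])
    # Texts with no category word fall back to a uniform pseudo-label.
    return np.where(totals > 0, counts / np.maximum(totals, 1), uniform)

def sharpen(probs, temperature=0.5):
    """Assumed sharpening: raise to 1/T and renormalize (T < 1 sharpens)."""
    powered = probs ** (1.0 / temperature)
    return powered / powered.sum(axis=1, keepdims=True)

def update_targets(pseudo_labels, predictions, alpha=0.5, temperature=0.5):
    """Assumed update: convex mix of pseudo-labels and sharpened predictions."""
    return alpha * pseudo_labels + (1 - alpha) * sharpen(predictions, temperature)

# Hypothetical toy data: two classes (sports, finance).
docs = [
    "the team won the game",
    "stocks fell as the market closed",
    "nothing relevant here",
]
category_words = [["team", "game"], ["market", "stocks"]]
pseudo = init_pseudo_labels(docs, category_words)

# Stand-in for classifier outputs; a real model would produce these.
predictions = np.array([[0.6, 0.4], [0.3, 0.7], [0.5, 0.5]])
targets = update_targets(pseudo, predictions)
```

In a self-training loop, `targets` would be recomputed from fresh predictions after each round of training. The mixing weight and temperature are hypothetical hyperparameters, not values reported by the authors.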
Cite
Text
Ouyang et al. "Weakly-Supervised Text Classification with Wasserstein Barycenters Regularization." International Joint Conference on Artificial Intelligence, 2022. doi:10.24963/IJCAI.2022/468
Markdown
[Ouyang et al. "Weakly-Supervised Text Classification with Wasserstein Barycenters Regularization." International Joint Conference on Artificial Intelligence, 2022.](https://mlanthology.org/ijcai/2022/ouyang2022ijcai-weakly/) doi:10.24963/IJCAI.2022/468
BibTeX
@inproceedings{ouyang2022ijcai-weakly,
title = {{Weakly-Supervised Text Classification with Wasserstein Barycenters Regularization}},
author = {Ouyang, Jihong and Wang, Yiming and Li, Ximing and Li, Changchun},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2022},
pages = {3373-3379},
doi = {10.24963/IJCAI.2022/468},
url = {https://mlanthology.org/ijcai/2022/ouyang2022ijcai-weakly/}
}