Bi-Matching Mechanism to Combat Long-Tail Senses of Word Sense Disambiguation
Abstract
Word sense distributions are long-tailed, so Word Sense Disambiguation (WSD) must handle both head senses, which have many training samples, and tail senses, which have only a few. Traditional recognition methods work well for head senses with sufficient training samples, but they cannot effectively handle tail senses. Inspired by the diverse memory and recognition abilities observed in children’s linguistic behavior, we propose a bi-matching mechanism for WSD. Because tail senses often appear in fixed collocations, we design a collocation feature matching method for tail senses and use the traditional definition matching method for head senses. Combining the two matching methods yields a WSD model with the bi-matching mechanism, called Bi-MWSD. Bi-MWSD effectively mitigates the difficulty of identifying tail senses caused by insufficient training samples. Experiments are conducted under the standard English all-words WSD evaluation framework and under the training-data-augmented evaluation framework. Bi-MWSD outperforms the baseline models and achieves state-of-the-art performance under the data augmentation evaluation framework.
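As a rough illustration of the idea of combining two matchers, the toy sketch below scores each candidate sense with a definition matcher and a collocation matcher and picks the sense with the highest weighted sum. It is not the Bi-MWSD model: the abstract does not specify the architecture, so the `Sense` class, the Jaccard-overlap definition score, the verbatim phrase lookup, the `alpha` weight, and the example sense keys and glosses are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Sense:
    """Hypothetical sense record; not a structure taken from the paper."""
    key: str                                   # illustrative sense identifier
    gloss: str                                 # dictionary-style definition
    collocations: list[str] = field(default_factory=list)  # fixed phrases tied to the sense


def tokens(text: str) -> set[str]:
    """Lowercase whitespace tokenization (a deliberately crude stand-in)."""
    return set(text.lower().split())


def definition_score(context: str, sense: Sense) -> float:
    """Jaccard overlap between context and gloss tokens.

    Assumption: stands in for a learned context-gloss matcher.
    """
    c, g = tokens(context), tokens(sense.gloss)
    union = c | g
    return len(c & g) / len(union) if union else 0.0


def collocation_score(context: str, sense: Sense) -> float:
    """1.0 if any listed collocation occurs verbatim in the context, else 0.0."""
    lowered = context.lower()
    return 1.0 if any(p.lower() in lowered for p in sense.collocations) else 0.0


def disambiguate(context: str, senses: list[Sense], alpha: float = 0.5) -> str:
    """Return the key of the sense with the highest weighted sum of both scores."""
    def combined(sense: Sense) -> float:
        return (alpha * definition_score(context, sense)
                + (1 - alpha) * collocation_score(context, sense))
    return max(senses, key=combined).key


# Illustrative senses for "bucket": a frequent (head) sense and a rare (tail) sense.
SENSES = [
    Sense("bucket.container", "an open container with a handle used to carry water"),
    Sense("bucket.die", "to die", ["kick the bucket", "kicked the bucket"]),
]

head_pick = disambiguate("she used the bucket to carry water", SENSES)
tail_pick = disambiguate("the old man finally kicked the bucket", SENSES)
```

In this toy setup the first context is resolved by gloss overlap alone, while the second shares no tokens with either gloss and is resolved only because the collocation matcher fires. That is the intuition the abstract gives for treating tail senses separately.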
Cite
Text
Zhang et al. "Bi-Matching Mechanism to Combat Long-Tail Senses of Word Sense Disambiguation." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022. doi:10.1007/978-3-031-26390-3_36
Markdown
[Zhang et al. "Bi-Matching Mechanism to Combat Long-Tail Senses of Word Sense Disambiguation." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022.](https://mlanthology.org/ecmlpkdd/2022/zhang2022ecmlpkdd-bimatching/) doi:10.1007/978-3-031-26390-3_36
BibTeX
@inproceedings{zhang2022ecmlpkdd-bimatching,
title = {{Bi-Matching Mechanism to Combat Long-Tail Senses of Word Sense Disambiguation}},
author = {Zhang, Junwei and He, Ruifang and Guo, Fengyu},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2022},
pages = {621--637},
doi = {10.1007/978-3-031-26390-3_36},
url = {https://mlanthology.org/ecmlpkdd/2022/zhang2022ecmlpkdd-bimatching/}
}