Extraction of Hierarchies Based on Inclusion of Co-Occurring Words with Frequency Information

Abstract

In this paper, we propose a method of automatically extracting word hierarchies based on the inclusion relations of word appearance patterns in corpora. We applied the complementary similarity measure (CSM) to determine a hierarchical structure of word meanings. The CSM is a similarity measure developed for recognizing degraded machine-printed text. There are CSMs for both binary and gray-scale images. The CSM for binary images has been applied to estimate one-to-many relations, such as superordinate-subordinate relations, and to extract word hierarchies. However, the CSM for gray-scale images has not been applied to natural language processing. Here, we apply the latter to extract word hierarchies from corpora. To do this, we used frequency information for co-occurring words, which is not considered when using the CSM for binary images. We compared our hierarchies with those obtained using the CSM for binary images, and evaluated them by measuring their degree of agreement with the EDR electronic dictionary.
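
As a rough illustration of why an inclusion-oriented measure can suggest superordinate-subordinate direction, the sketch below computes the binary-image CSM in the form commonly given in the related literature, (ad - bc) / sqrt((a + c)(b + d)). The abstract does not state this formula, so it is an assumption here, and the function name `csm_binary` and the toy vectors are hypothetical. The gray-scale variant used in the paper is not reproduced.

```python
import math


def csm_binary(f, t):
    """Binary CSM of appearance pattern f against pattern t.

    Assumed formula (not stated in the abstract):
        (a*d - b*c) / sqrt((a + c) * (b + d))
    where each position is a co-occurring word and
        a = positions set in both f and t
        b = positions set in f only
        c = positions set in t only
        d = positions set in neither
    """
    if len(f) != len(t):
        raise ValueError("patterns must have the same length")
    a = sum(1 for x, y in zip(f, t) if x and y)
    b = sum(1 for x, y in zip(f, t) if x and not y)
    c = sum(1 for x, y in zip(f, t) if not x and y)
    d = sum(1 for x, y in zip(f, t) if not x and not y)
    denom = math.sqrt((a + c) * (b + d))
    # Degenerate case: t is all zeros or all ones, so the measure is undefined.
    return (a * d - b * c) / denom if denom else 0.0


# Toy patterns: every co-occurring word of `narrow` also occurs with `broad`.
broad = [1, 1, 1, 0, 0, 0, 0, 0]
narrow = [1, 0, 0, 0, 0, 0, 0, 0]

score_broad_over_narrow = csm_binary(broad, narrow)
score_narrow_over_broad = csm_binary(narrow, broad)
```

The measure is asymmetric: the numerator is unchanged when the arguments are swapped, but the denominator depends only on the second argument. In this toy case the score is higher when the including pattern comes first, which is the kind of directional signal a hierarchy extractor could use.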

Cite

Text

Yamamoto et al. "Extraction of Hierarchies Based on Inclusion of Co-Occurring Words with Frequency Information." International Joint Conference on Artificial Intelligence, 2005.

Markdown

[Yamamoto et al. "Extraction of Hierarchies Based on Inclusion of Co-Occurring Words with Frequency Information." International Joint Conference on Artificial Intelligence, 2005.](https://mlanthology.org/ijcai/2005/yamamoto2005ijcai-extraction/)

BibTeX

@inproceedings{yamamoto2005ijcai-extraction,
  title     = {{Extraction of Hierarchies Based on Inclusion of Co-Occurring Words with Frequency Information}},
  author    = {Yamamoto, Eiko and Kanzaki, Kyoko and Isahara, Hitoshi},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2005},
  pages     = {1166-1174},
  url       = {https://mlanthology.org/ijcai/2005/yamamoto2005ijcai-extraction/}
}