Morphological Annotation of a Large Spontaneous Speech Corpus in Japanese

Uchimoto, Kiyotaka; Isahara, Hitoshi

IJCAI 2007 pp. 1731-1737

Abstract

We propose an efficient framework for human-aided morphological annotation of a large spontaneous speech corpus such as the Corpus of Spontaneous Japanese. In this framework, even when word units have several definitions in a given corpus, and not all words are found in a dictionary or in a training corpus, we can morphologically analyze the given corpus with high accuracy and low labor costs by detecting words not found in the dictionary and putting them into it. We can further reduce labor costs by expanding training corpora based on active learning.

Cite

Text

Uchimoto and Isahara. "Morphological Annotation of a Large Spontaneous Speech Corpus in Japanese." International Joint Conference on Artificial Intelligence, 2007.

Markdown

[Uchimoto and Isahara. "Morphological Annotation of a Large Spontaneous Speech Corpus in Japanese." International Joint Conference on Artificial Intelligence, 2007.](https://mlanthology.org/ijcai/2007/uchimoto2007ijcai-morphological/)

BibTeX

@inproceedings{uchimoto2007ijcai-morphological,
  title     = {{Morphological Annotation of a Large Spontaneous Speech Corpus in Japanese}},
  author    = {Uchimoto, Kiyotaka and Isahara, Hitoshi},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2007},
  pages     = {1731--1737},
  url       = {https://mlanthology.org/ijcai/2007/uchimoto2007ijcai-morphological/}
}