Wrap-up: A Trainable Discourse Module for Information Extraction
Abstract
The vast amounts of on-line text now available have led to renewed interest in information extraction (IE) systems that analyze unrestricted text, producing a structured representation of selected information from the text. This paper presents a novel approach that uses machine learning to acquire knowledge for some of the higher-level IE processing. Wrap-Up is a trainable IE discourse component that makes intersentential inferences and identifies logical relations among information extracted from the text. Previous corpus-based approaches were limited to lower-level processing such as part-of-speech tagging, lexical disambiguation, and dictionary construction. Wrap-Up is fully trainable, and not only automatically decides what classifiers are needed, but even derives the feature set for each classifier automatically. Performance equals that of a partially trainable discourse module requiring manual customization for each domain.
Cite
Text
Soderland and Lehnert. "Wrap-up: A Trainable Discourse Module for Information Extraction." Journal of Artificial Intelligence Research, 1994. doi:10.1613/JAIR.68
Markdown
[Soderland and Lehnert. "Wrap-up: A Trainable Discourse Module for Information Extraction." Journal of Artificial Intelligence Research, 1994.](https://mlanthology.org/jair/1994/soderland1994jair-wrapup/) doi:10.1613/JAIR.68
BibTeX
@article{soderland1994jair-wrapup,
title = {{Wrap-up: A Trainable Discourse Module for Information Extraction}},
author = {Soderland, Stephen and Lehnert, Wendy G.},
journal = {Journal of Artificial Intelligence Research},
year = {1994},
pages = {131--158},
doi = {10.1613/JAIR.68},
volume = {2},
url = {https://mlanthology.org/jair/1994/soderland1994jair-wrapup/}
}