Interfacing Issues for Information Extraction

Abstract

Traditional approaches to information extraction implicitly assume that many elements of the task are static: the user’s query and the descriptions of the domain and corpus, for example. We believe that in many real situations this assumption does not hold, and that it is important to consider how the system could best support interaction with the user when the assumption breaks down. Current goals in the information extraction community are for the system to produce accurate results while being easy to retrain and port to a new domain. We seek to extend current approaches to handle dynamic elements of the problem. “Evolving queries”, discussed in the information retrieval (IR) literature, need to be supported by information extraction (IE) systems; IR and IE are both, after all, tools for gathering information from documents in response to a user query. When a casual user (an expert neither in the use of the system nor in the domain) engages in any information gathering task, there will be an initial phase of investigation and discovery during which the user becomes familiar with the system, the domain, and the documents in the corpus. During this phase the user’s query may change or evolve over time. For example, a user may have a query about terrorist activities, asking for the names of perpetrators and the locations of targets; an interim system output then prompts the user to refine the query, redefining terrorist activities as involving only a subset of weapons while generalizing to allow for additional (e.g., government) perpetrators. The query is not the only element that may change over time; certainly the domain evolves as additional documents are processed. Likewise, when the corpus is very large or dynamic (e.g., the Internet), the corpus itself may be seen as evolving: rules for mapping text patterns to query items that apply at one time, or for one portion of the corpus, no longer apply for another.
To provide more robust support for information extraction in a dynamic environment, we consider such issues as:

Cite

Text

Vanderheyden and Cohen. "Interfacing Issues for Information Extraction." AAAI Conference on Artificial Intelligence, 2000.

BibTeX

@inproceedings{vanderheyden2000aaai-interfacing,
  title     = {{Interfacing Issues for Information Extraction}},
  author    = {Vanderheyden, Peter B. and Cohen, Robin},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2000},
  pages     = {1096},
  url       = {https://mlanthology.org/aaai/2000/vanderheyden2000aaai-interfacing/}
}