Bottom-up Relational Learning of Pattern Matching Rules for Information Extraction

Abstract

Information extraction is a form of shallow text processing that locates a specified set of relevant items in a natural-language document. Systems for this task require significant domain-specific knowledge and are time-consuming and difficult to build by hand, making them a good application for machine learning. We present an algorithm, RAPIER, that uses pairs of sample documents and filled templates to induce pattern-match rules that directly extract fillers for the slots in the template. RAPIER is a bottom-up learning algorithm that incorporates techniques from several inductive logic programming systems. We have implemented the algorithm in a system that allows patterns to have constraints on the words, part-of-speech tags, and semantic classes present in the filler and the surrounding text. We present encouraging experimental results on two domains.
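To make the idea of a pattern-match rule with word, part-of-speech, and semantic-class constraints concrete, here is a minimal sketch of how such a rule might be represented and applied. It is not the RAPIER implementation and does not cover rule induction. The names `Token`, `Element`, and `Rule`, the split into `pre`, `filler`, and `post` parts, the one-token-per-element simplification, and the example sentence and slot name are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    word: str
    tag: str                     # part-of-speech tag
    sem: Optional[str] = None    # semantic class, if known (hypothetical labels)


@dataclass(frozen=True)
class Element:
    """Constraints on a single token; a None field means 'unconstrained'."""
    words: Optional[frozenset] = None
    tags: Optional[frozenset] = None
    sem: Optional[str] = None

    def matches(self, tok: Token) -> bool:
        return ((self.words is None or tok.word.lower() in self.words)
                and (self.tags is None or tok.tag in self.tags)
                and (self.sem is None or tok.sem == self.sem))


@dataclass(frozen=True)
class Rule:
    slot: str        # template slot this rule fills
    pre: tuple       # constraints on the text before the filler
    filler: tuple    # constraints on the filler itself
    post: tuple      # constraints on the text after the filler

    def extract(self, tokens: list) -> list:
        """Return every filler string whose context satisfies the rule."""
        pattern = self.pre + self.filler + self.post
        start = len(self.pre)
        end = start + len(self.filler)
        found = []
        # Slide a fixed-size window over the document and test each position.
        for i in range(len(tokens) - len(pattern) + 1):
            window = tokens[i:i + len(pattern)]
            if all(e.matches(t) for e, t in zip(pattern, window)):
                found.append(" ".join(t.word for t in window[start:end]))
        return found


# Hypothetical example: extract a city that follows "in" and precedes ", <state>".
tokens = [
    Token("located", "VBN"),
    Token("in", "IN"),
    Token("Atlanta", "NNP"),
    Token(",", ","),
    Token("Georgia", "NNP", sem="state"),
]
rule = Rule(
    slot="city",
    pre=(Element(words=frozenset({"in"})),),
    filler=(Element(tags=frozenset({"NNP"})),),
    post=(Element(words=frozenset({","})), Element(sem="state")),
)
print(rule.extract(tokens))
```

A learner in this setting would search for rules of this kind from document and filled-template pairs; the sketch only shows how a single, already-written rule could be matched against tagged text.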

Cite

Text

Califf and Mooney. "Bottom-up Relational Learning of Pattern Matching Rules for Information Extraction." Journal of Machine Learning Research, 2003.

Markdown

[Califf and Mooney. "Bottom-up Relational Learning of Pattern Matching Rules for Information Extraction." Journal of Machine Learning Research, 2003.](https://mlanthology.org/jmlr/2003/califf2003jmlr-bottomup/)

BibTeX

@article{califf2003jmlr-bottomup,
  title     = {{Bottom-up Relational Learning of Pattern Matching Rules for Information Extraction}},
  author    = {Califf, Mary Elaine and Mooney, Raymond J.},
  journal   = {Journal of Machine Learning Research},
  year      = {2003},
  pages     = {177--210},
  volume    = {4},
  url       = {https://mlanthology.org/jmlr/2003/califf2003jmlr-bottomup/}
}