Learning to Extract Symbolic Knowledge from the World Wide Web

Abstract

The goal of the Web-KB project is to develop automatic methods for constructing and maintaining large knowledge bases whose contents mirror those of the World Wide Web. We argue for the feasibility of a system which, given a manually constructed ontology and a seed knowledge base comprising a set of labeled Web pages, learns to instantiate knowledge-base objects and relations from the Web. Such a system could construct a knowledge base supporting concept-oriented queries to the Web, or serve as a resource for Web-based problem solvers and search agents. At the heart of this task are two general sub-problems: identifying instances of the classes, and identifying instances of the relations that are defined in the ontology. The first sub-problem includes the task of recognizing cases in which individual Web pages correspond to the classes of interest, and the second sub-problem includes the task of identifying cases in which pairs of pages instantiate the ontology's relations. We present the results of initial experiments into the use of text classification and relational learning for these tasks, and sketch problems for future research.
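The first sub-problem, recognizing which class a Web page belongs to, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names, the relation signature, the seed pages, and the choice of a bag-of-words naive Bayes classifier with Laplace smoothing are not taken from the abstract, which says only that text classification and relational learning were used. Relation learning over pairs of pages is not sketched here; the `RELATIONS` entry only shows how a relation might be declared in an ontology.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy ontology: class names and the relation signature are illustrative only.
CLASSES = ["student", "course", "faculty"]
RELATIONS = {"instructor_of": ("faculty", "course")}  # declared, not learned in this sketch

# Hypothetical seed knowledge base: (page text, class label) pairs.
SEED_PAGES = [
    ("homework syllabus lecture exam", "course"),
    ("lecture notes assignments grading", "course"),
    ("my advisor my research resume", "student"),
    ("phd student research interests", "student"),
]

def train(pages):
    """Collect per-class word counts, class counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in pages:
        words = text.split()
        word_counts[label].update(words)
        class_counts[label] += 1
        vocab.update(words)
    return word_counts, class_counts, vocab

def classify(text, model):
    """Return the class with the highest naive Bayes log score for the page text."""
    word_counts, class_counts, vocab = model
    total_pages = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_pages)
        # Laplace smoothing: add one to every word count in the vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            if word in vocab:  # words unseen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(SEED_PAGES)
predicted = classify("lecture exam syllabus", model)
```

With this toy seed set, a page whose text is dominated by course vocabulary is assigned to the hypothetical `course` class. A real system would need far more labeled pages, and would have to handle pages that belong to none of the ontology's classes.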

Cite

Text

Craven et al. "Learning to Extract Symbolic Knowledge from the World Wide Web." AAAI Conference on Artificial Intelligence, 1998.

Markdown

[Craven et al. "Learning to Extract Symbolic Knowledge from the World Wide Web." AAAI Conference on Artificial Intelligence, 1998.](https://mlanthology.org/aaai/1998/craven1998aaai-learning/)

BibTeX

@inproceedings{craven1998aaai-learning,
  title     = {{Learning to Extract Symbolic Knowledge from the World Wide Web}},
  author    = {Craven, Mark and DiPasquo, Dan and Freitag, Dayne and McCallum, Andrew and Mitchell, Tom M. and Nigam, Kamal and Slattery, Seán},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {1998},
  pages     = {509--516},
  url       = {https://mlanthology.org/aaai/1998/craven1998aaai-learning/}
}