Constraint-Based Entity Matching

Shen, Warren; Li, Xin; Doan, AnHai

AAAI 2005 pp. 862-867

Abstract

Entity matching is the problem of deciding if two given mentions in the data, such as "Helen Hunt" and "H. M. Hunt", refer to the same real-world entity. Numerous solutions have been developed, but they have not considered in depth the problem of exploiting integrity constraints that frequently exist in the domains. Examples of such constraints include "a mention with age two cannot match a mention with salary 200K" and "if two paper citations match, then their authors are likely to match in the same order". In this paper we describe a probabilistic solution to entity matching that exploits such constraints to improve matching accuracy. At the heart of the solution is a generative model that takes into account the constraints during the generation process, and provides well-defined interpretations of the constraints. We describe a novel combination of EM and relaxation labeling algorithms that efficiently learns the model, thereby matching mentions in an unsupervised way, without the need for annotated training data. Experiments on several real-world domains show that our solution can exploit constraints to significantly improve matching accuracy, by 3-12% F-1, and that the solution scales up to large data sets.
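As a rough illustration of the first kind of constraint mentioned in the abstract, the sketch below shows how a hard constraint could veto a candidate match between two mentions. It is not the paper's generative model or its EM and relaxation labeling procedure; the `Mention` class, the function names, and the age and salary thresholds are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mention:
    """A hypothetical mention record; fields may be missing (None)."""
    name: str
    age: Optional[int] = None
    salary: Optional[int] = None


def violates_age_salary(a: Mention, b: Mention,
                        max_child_age: int = 10,
                        min_high_salary: int = 100_000) -> bool:
    """Assumed hard constraint: a very young person cannot be a high earner.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    def is_young(m: Mention) -> bool:
        return m.age is not None and m.age <= max_child_age

    def is_high_earner(m: Mention) -> bool:
        return m.salary is not None and m.salary >= min_high_salary

    return (is_young(a) and is_high_earner(b)) or (is_young(b) and is_high_earner(a))


def constrained_score(a: Mention, b: Mention, base_score: float) -> float:
    """Return 0.0 when the constraint is violated, else the unconstrained score."""
    return 0.0 if violates_age_salary(a, b) else base_score


if __name__ == "__main__":
    toddler = Mention("Helen Hunt", age=2)
    earner = Mention("H. M. Hunt", salary=200_000)
    adult = Mention("Helen Hunt", age=40)
    print(constrained_score(toddler, earner, 0.9))  # pair is vetoed by the constraint
    print(constrained_score(adult, earner, 0.9))    # pair keeps its base score
```

A soft constraint, such as authors of matching citations tending to match in order, would adjust the score rather than zero it out; that case is not covered by this sketch.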

Cite

Text

Shen et al. "Constraint-Based Entity Matching." AAAI Conference on Artificial Intelligence, 2005.

Markdown

[Shen et al. "Constraint-Based Entity Matching." AAAI Conference on Artificial Intelligence, 2005.](https://mlanthology.org/aaai/2005/shen2005aaai-constraint/)

BibTeX

@inproceedings{shen2005aaai-constraint,
  title     = {{Constraint-Based Entity Matching}},
  author    = {Shen, Warren and Li, Xin and Doan, AnHai},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2005},
  pages     = {862-867},
  url       = {https://mlanthology.org/aaai/2005/shen2005aaai-constraint/}
}