Semi-Supervised Learning of Attribute-Value Pairs from Product Descriptions

Abstract

We describe an approach to extract attribute-value pairs from product descriptions. This allows us to represent products as sets of such attribute-value pairs to augment product databases. Such a representation is useful for a variety of tasks where treating a product as a set of attribute-value pairs is more useful than treating it as an atomic entity. Examples of such applications include product recommendations, product comparison, and demand forecasting. We formulate the extraction as a classification problem and use a semi-supervised algorithm (co-EM) along with a supervised algorithm (Naive Bayes). The extraction system requires very little initial user supervision: using unlabeled data, we automatically extract an initial seed list that serves as training data for the supervised and semi-supervised classification algorithms. Finally, the extracted attributes and values are linked to form pairs using dependency information and co-location scores. We present promising results on product descriptions in two categories of sporting goods.
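
The classification step named in the abstract can be illustrated with a minimal sketch of co-EM over two Naive Bayes classifiers. This is not the authors' implementation: the choice of views (the word itself versus its surrounding context words), the two-label set, the toy seed and unlabeled data, and all function names (`train_nb`, `predict_nb`, `co_em`, `classify`) are assumptions made for illustration. The pairing step (dependency information and co-location scores) is not covered.

```python
import math
from collections import defaultdict

LABELS = ("attribute", "value")  # assumed label set for this sketch

def train_nb(examples, alpha=1.0):
    """Fit Naive Bayes from (features, {label: weight}) pairs; weights may be soft."""
    prior = {c: alpha for c in LABELS}
    counts = {c: defaultdict(float) for c in LABELS}
    vocab = set()
    for feats, dist in examples:
        for c in LABELS:
            w = dist.get(c, 0.0)
            prior[c] += w
            for f in feats:
                counts[c][f] += w
                vocab.add(f)
    return prior, counts, vocab, alpha

def predict_nb(model, feats):
    """Return the posterior {label: probability} for one feature list."""
    prior, counts, vocab, alpha = model
    total_prior = sum(prior.values())
    log_scores = {}
    for c in LABELS:
        # Laplace smoothing; the +1 reserves mass for unseen features.
        denom = sum(counts[c].values()) + alpha * (len(vocab) + 1)
        s = math.log(prior[c] / total_prior)
        for f in feats:
            s += math.log((counts[c].get(f, 0.0) + alpha) / denom)
        log_scores[c] = s
    m = max(log_scores.values())
    exp = {c: math.exp(s - m) for c, s in log_scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}

def co_em(labeled, unlabeled, iterations=5):
    """labeled: (view1, view2, label) triples; unlabeled: (view1, view2) pairs."""
    hard = [(v1, v2, {y: 1.0}) for v1, v2, y in labeled]
    seeds1 = [(v1, d) for v1, _, d in hard]
    seeds2 = [(v2, d) for _, v2, d in hard]
    # Start from a view-1 classifier trained on the seeds alone.
    model1 = train_nb(seeds1)
    model2 = None
    for _ in range(iterations):
        # View 1 labels the unlabeled data probabilistically; view 2 trains on it.
        soft1 = [predict_nb(model1, v1) for v1, _ in unlabeled]
        model2 = train_nb(seeds2 + [(v2, d) for (_, v2), d in zip(unlabeled, soft1)])
        # View 2 then labels the same data; view 1 is retrained on those labels.
        soft2 = [predict_nb(model2, v2) for _, v2 in unlabeled]
        model1 = train_nb(seeds1 + [(v1, d) for (v1, _), d in zip(unlabeled, soft2)])
    return model1, model2

def classify(models, v1, v2):
    """Combine both views by multiplying their posteriors (an assumed rule)."""
    p1, p2 = predict_nb(models[0], v1), predict_nb(models[1], v2)
    return max(LABELS, key=lambda c: p1[c] * p2[c])

# Toy data, invented for this sketch: view 1 is the word, view 2 its context.
SEEDS = [
    (["material"], ["shell", "is"], "attribute"),
    (["leather"], ["genuine", "upper"], "value"),
]
UNLABELED = [
    (["leather"], ["soft", "upper"]),
    (["suede"], ["soft", "upper"]),
    (["material"], ["shell", "type"]),
    (["fabric"], ["shell", "type"]),
]

models = co_em(SEEDS, UNLABELED)
print(classify(models, ["suede"], ["soft", "upper"]))
print(classify(models, ["fabric"], ["shell", "type"]))
```

In this toy setup, neither "suede" nor "fabric" appears in the seed list; any label they receive comes from the context view passing its probabilistic labels back to the word view, which is the exchange co-EM relies on.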

Cite

Text

Probst et al. "Semi-Supervised Learning of Attribute-Value Pairs from Product Descriptions." International Joint Conference on Artificial Intelligence, 2007.

Markdown

[Probst et al. "Semi-Supervised Learning of Attribute-Value Pairs from Product Descriptions." International Joint Conference on Artificial Intelligence, 2007.](https://mlanthology.org/ijcai/2007/probst2007ijcai-semi/)

BibTeX

@inproceedings{probst2007ijcai-semi,
  title     = {{Semi-Supervised Learning of Attribute-Value Pairs from Product Descriptions}},
  author    = {Probst, Katharina and Ghani, Rayid and Krema, Marko and Fano, Andrew E. and Liu, Yan},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2007},
  pages     = {2838-2843},
  url       = {https://mlanthology.org/ijcai/2007/probst2007ijcai-semi/}
}