Avoiding Bias When Aggregating Relational Data with Degree Disparity

Jensen, David D.; Neville, Jennifer; Hay, Michael

ICML 2003 pp. 274-281

Abstract

A common characteristic of relational data sets, degree disparity, can lead relational learning algorithms to discover misleading correlations. Degree disparity occurs when the frequency of a relation is correlated with the values of the target variable. In such cases, aggregation functions used by many relational learning algorithms will result in misleading correlations and added complexity in models. We examine this problem through a combination of simulations and experiments. We show how two novel hypothesis testing procedures can adjust for the effects of using aggregation functions in the presence of degree disparity.

ICML: Proceedings of the Twentieth International Conference on Machine Learning
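The effect the abstract describes can be illustrated with a small simulation. The sketch below is not taken from the paper and does not implement its hypothesis testing procedures; it is a toy setup under stated assumptions. Objects carry a binary target label, objects with label 1 are assumed to have more related items (the degree disparity), and each related item carries an attribute value drawn independently of the label. The function names `pearson` and `simulate`, the degree ranges, and the uniform attribute distribution are all illustrative choices.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def simulate(n_objects=2000, seed=0):
    """Return (corr(label, SUM), corr(label, MEAN)) for synthetic data."""
    rng = random.Random(seed)
    labels, sums, means = [], [], []
    for _ in range(n_objects):
        label = rng.randint(0, 1)
        # Assumed degree disparity: label-1 objects have more related items.
        degree = rng.randint(5, 10) if label == 1 else rng.randint(1, 5)
        # Attribute values on related items are independent of the label.
        values = [rng.random() for _ in range(degree)]
        labels.append(label)
        sums.append(sum(values))
        means.append(sum(values) / degree)
    return pearson(labels, sums), pearson(labels, means)


corr_sum, corr_mean = simulate()
print(f"corr(label, SUM)  = {corr_sum:.3f}")
print(f"corr(label, MEAN) = {corr_mean:.3f}")
```

In this toy setup the SUM aggregate correlates strongly with the label even though no individual attribute value depends on it, because the sum grows with the number of related items. The MEAN aggregate, which normalizes by degree, shows a correlation near zero here. A learner that scores SUM features without accounting for degree could therefore report a dependence that only reflects how many related items each object has.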

Cite

Text

Jensen et al. "Avoiding Bias When Aggregating Relational Data with Degree Disparity." International Conference on Machine Learning, 2003.

Markdown

[Jensen et al. "Avoiding Bias When Aggregating Relational Data with Degree Disparity." International Conference on Machine Learning, 2003.](https://mlanthology.org/icml/2003/jensen2003icml-avoiding/)

BibTeX

@inproceedings{jensen2003icml-avoiding,
  title     = {{Avoiding Bias When Aggregating Relational Data with Degree Disparity}},
  author    = {Jensen, David D. and Neville, Jennifer and Hay, Michael},
  booktitle = {International Conference on Machine Learning},
  year      = {2003},
  pages     = {274--281},
  url       = {https://mlanthology.org/icml/2003/jensen2003icml-avoiding/}
}