On Biases in Estimating Multi-Valued Attributes

Abstract

We analyse the biases of eleven measures for estimating the quality of multi-valued attributes. The values of information gain, J-measure, Gini index, and relevance tend to increase linearly with the number of values of an attribute. The values of gain ratio, distance measure, Relief, and the weight of evidence decrease for informative attributes and increase for irrelevant attributes. The bias of the statistical tests based on the chi-square distribution is similar, but these functions are not able to discriminate among attributes of different quality. We also introduce a new function based on the MDL principle whose value decreases slightly as the number of attribute values increases.
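The bias of information gain towards attributes with many values can be illustrated with a small simulation. The sketch below is not the paper's experimental setup; it assumes a two-class problem with uniformly random labels and a uniformly random, irrelevant attribute, and the function names, sample size, and trial count are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(attr, labels):
    """Class entropy minus the expected class entropy after splitting on attr."""
    n = len(labels)
    groups = {}
    for a, y in zip(attr, labels):
        groups.setdefault(a, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def mean_gain_of_random_attribute(num_values, n=200, trials=30, seed=0):
    """Average gain of an attribute drawn independently of the class.

    Assumption: labels and attribute values are both uniform and independent,
    so the attribute carries no real information about the class.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        labels = [rng.randrange(2) for _ in range(n)]
        attr = [rng.randrange(num_values) for _ in range(n)]
        total += information_gain(attr, labels)
    return total / trials


gains = {v: mean_gain_of_random_attribute(v) for v in (2, 5, 10, 20, 40)}
for v, g in gains.items():
    print(f"{v:>3} values: mean gain = {g:.4f} bits")
```

Although the attribute is irrelevant by construction, its estimated gain grows with the number of values, which is the kind of bias the abstract describes for information gain.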

Cite

Text

Kononenko. "On Biases in Estimating Multi-Valued Attributes." International Joint Conference on Artificial Intelligence, 1995.

Markdown

[Kononenko. "On Biases in Estimating Multi-Valued Attributes." International Joint Conference on Artificial Intelligence, 1995.](https://mlanthology.org/ijcai/1995/kononenko1995ijcai-biases/)

BibTeX

@inproceedings{kononenko1995ijcai-biases,
  title     = {{On Biases in Estimating Multi-Valued Attributes}},
  author    = {Kononenko, Igor},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {1995},
  pages     = {1034-1040},
  url       = {https://mlanthology.org/ijcai/1995/kononenko1995ijcai-biases/}
}