On Biases in Estimating Multi-Valued Attributes
Abstract
We analyse the biases of eleven measures for estimating the quality of multi-valued attributes. The values of information gain, J-measure, gini-index, and relevance tend to increase linearly with the number of values of an attribute. The values of gain-ratio, distance measure, Relief, and the weight of evidence decrease for informative attributes and increase for irrelevant attributes. The bias of the statistical tests based on the chi-square distribution is similar, but these functions are not able to discriminate among attributes of different quality. We also introduce a new function based on the MDL principle whose value decreases slightly as the number of an attribute's values increases.
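The first bias named in the abstract, information gain growing with the number of attribute values, can be illustrated with a small simulation. The sketch below is not the paper's experimental setup: the two uniformly distributed classes, the sample size, the trial count, and the function names are all assumptions chosen for illustration. It estimates the average information gain of an attribute that is generated independently of the class, so any gain it reports is pure estimation bias.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(attr, labels):
    """Class entropy minus the class entropy conditioned on the attribute."""
    n = len(labels)
    groups = {}
    for a, y in zip(attr, labels):
        groups.setdefault(a, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def mean_gain_of_irrelevant_attribute(num_values, n=300, trials=50, seed=0):
    """Average information gain of an attribute drawn independently of the class.

    Assumed setup (hypothetical, for illustration only): two equiprobable
    classes, attribute values uniform over `num_values` possibilities.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        labels = [rng.randrange(2) for _ in range(n)]
        attr = [rng.randrange(num_values) for _ in range(n)]
        total += information_gain(attr, labels)
    return total / trials


if __name__ == "__main__":
    # The attribute carries no information about the class, yet its
    # estimated gain grows as the number of values grows.
    for v in (2, 4, 8, 16, 32):
        print(v, round(mean_gain_of_irrelevant_attribute(v), 4))
```

Under these assumptions the printed gains rise steadily with the number of values, even though the attribute is irrelevant by construction.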
Cite
Text
Kononenko. "On Biases in Estimating Multi-Valued Attributes." International Joint Conference on Artificial Intelligence, 1995.
Markdown
[Kononenko. "On Biases in Estimating Multi-Valued Attributes." International Joint Conference on Artificial Intelligence, 1995.](https://mlanthology.org/ijcai/1995/kononenko1995ijcai-biases/)
BibTeX
@inproceedings{kononenko1995ijcai-biases,
title = {{On Biases in Estimating Multi-Valued Attributes}},
author = {Kononenko, Igor},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {1995},
pages = {1034-1040},
url = {https://mlanthology.org/ijcai/1995/kononenko1995ijcai-biases/}
}