Selecting Weighting Factors in Logarithmic Opinion Pools

Abstract

A simple linear averaging of the outputs of several networks, as e.g. in bagging [3], seems to follow naturally from a bias/variance decomposition of the sum-squared error. The sum-squared error of the average model is a quadratic function of the weighting factors assigned to the networks in the ensemble [7], suggesting a quadratic programming algorithm for finding the "optimal" weighting factors. If we interpret the output of a network as a probability statement, the sum-squared error corresponds to minus the log-likelihood or the Kullback-Leibler divergence, and linear averaging of the outputs to logarithmic averaging of the probability statements: the logarithmic opinion pool. The crux of this paper is that this whole story about model averaging, bias/variance decompositions, and quadratic programming to find the optimal weighting factors, is not specific to the sum-squared error, but applies to the combination of probability statements of any kind in a logarithmic opinion pool, as long as the Kullback-Leibler divergence plays the role of the error measure. As examples we treat model averaging for classification models under a cross-entropy error measure and models for estimating variances.
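To make the quadratic-programming step concrete, here is a minimal sketch (not from the paper) of finding ensemble weights for the sum-squared-error case. Since the error of the weighted average is the quadratic form w^T C w in the weights, where C is the matrix of error correlations between the networks, the weights can be found by a constrained QP; the abstract's point is that the same recipe carries over to weighting probability models in a logarithmic opinion pool, P(x) proportional to the product of P_a(x)^{w_a}, with the Kullback-Leibler divergence as error measure. All data, names, and the use of scipy's SLSQP solver below are illustrative assumptions.

import numpy as np
from scipy.optimize import minimize

# Illustrative setup: m networks evaluated on n validation points.
rng = np.random.default_rng(0)
m, n = 4, 200
targets = rng.normal(size=n)
preds = targets + 0.5 * rng.normal(size=(m, n))  # m noisy models

# Error-correlation matrix C_ab = E[(f_a - t)(f_b - t)].
# The sum-squared error of the weighted average sum_a w_a f_a
# is the quadratic form w^T C w, hence a QP in the weights w.
residuals = preds - targets
C = residuals @ residuals.T / n

# Minimize w^T C w subject to w >= 0 and sum(w) = 1.
res = minimize(
    lambda w: w @ C @ w,
    x0=np.full(m, 1.0 / m),
    bounds=[(0.0, 1.0)] * m,
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    method="SLSQP",
)
print("optimal weights:", np.round(res.x, 3))

For other error measures, only the construction of C changes (e.g. cross-entropy residuals for classification models); the simplex-constrained minimization itself is unchanged.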

Cite

Text

Heskes. "Selecting Weighting Factors in Logarithmic Opinion Pools." Neural Information Processing Systems, 1997.

Markdown

[Heskes. "Selecting Weighting Factors in Logarithmic Opinion Pools." Neural Information Processing Systems, 1997.](https://mlanthology.org/neurips/1997/heskes1997neurips-selecting/)

BibTeX

@inproceedings{heskes1997neurips-selecting,
  title     = {{Selecting Weighting Factors in Logarithmic Opinion Pools}},
  author    = {Heskes, Tom},
  booktitle = {Neural Information Processing Systems},
  year      = {1997},
  pages     = {266--272},
  url       = {https://mlanthology.org/neurips/1997/heskes1997neurips-selecting/}
}