Non Mean Square Error Criteria for the Training of Learning Machines

Abstract

In recent papers, Miller, Goodman & Smyth (1991, 1993) provided conditions on the cost function used to train a neural network that ensure the output of the network approximates the conditional expectation of the desired output, given the input. However, they considered only the single-output case. In this paper, we provide another, rather straightforward, proof of the same results for the general multi-output case, with all developments presented in the context of estimation theory. More precisely, among a class of "reasonable" performance criteria, we provide necessary and sufficient conditions on the cost function so that the optimal estimate is the conditional expectation of the desired outputs, whatever the characteristics of the noise affecting the data. We furthermore provide a short overview of related results from estimation theory, and numerically verify the developments by comparing the optimal estimators of several performance criteria. We stress that, while all these results are stated for a neural network, they hold in general for any learning machine trained to predict an output y as a function of an input x.
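The comparison of optimal estimators under different criteria can be illustrated with a minimal sketch. It is not the paper's experiment. It only reproduces a standard textbook fact for a single scalar output at one fixed input: the squared-error cost is minimized by the mean of the desired output, whereas the absolute-error cost is minimized by its median. The exponential noise model, the sample size, the search grid, and the helper names `empirical_risk` and `best_constant` are all assumptions made for illustration.

```python
import random
import statistics


def empirical_risk(samples, estimate, cost):
    """Average cost incurred when `estimate` is predicted for every sample."""
    return sum(cost(y - estimate) for y in samples) / len(samples)


def best_constant(samples, cost, grid):
    """Return the grid value with the lowest empirical risk (hypothetical helper)."""
    return min(grid, key=lambda c: empirical_risk(samples, c, cost))


# Assumed setup: desired outputs observed at one fixed input x, with skewed
# (exponential) noise so that the mean and the median differ noticeably.
rng = random.Random(0)
samples = [rng.expovariate(1.0) for _ in range(2000)]

# Candidate estimates 0.00, 0.01, ..., 3.00.
grid = [i / 100 for i in range(301)]


def squared(e):
    return e * e


best_sq = best_constant(samples, squared, grid)   # minimizer of squared error
best_abs = best_constant(samples, abs, grid)      # minimizer of absolute error

sample_mean = statistics.fmean(samples)
sample_median = statistics.median(samples)

print(f"squared-error minimizer:  {best_sq:.2f} (sample mean   {sample_mean:.3f})")
print(f"absolute-error minimizer: {best_abs:.2f} (sample median {sample_median:.3f})")
```

With skewed noise the two minimizers differ, so of these two criteria only the squared error recovers the (empirical) conditional expectation. With symmetric noise they would coincide, which is why the requirement "whatever the noise characteristics" matters.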

Cite

Text

Saerens. "Non Mean Square Error Criteria for the Training of Learning Machines." International Conference on Machine Learning, 1996.

Markdown

[Saerens. "Non Mean Square Error Criteria for the Training of Learning Machines." International Conference on Machine Learning, 1996.](https://mlanthology.org/icml/1996/saerens1996icml-non/)

BibTeX

@inproceedings{saerens1996icml-non,
  title     = {{Non Mean Square Error Criteria for the Training of Learning Machines}},
  author    = {Saerens, Marco},
  booktitle = {International Conference on Machine Learning},
  year      = {1996},
  pages     = {427--434},
  url       = {https://mlanthology.org/icml/1996/saerens1996icml-non/}
}