Minimum Description Length, Regularization, and Multimodal Data
Abstract
Relationships between clustering, description length, and regularization are pointed out, motivating the introduction of a cost function with a description length interpretation and the unusual and useful property of having its minimum approximated by the densest mode of a distribution. A simple inverse kinematics example is used to demonstrate that this property can be used to select and learn one branch of a multivalued mapping. This property is also used to develop a method for setting regularization parameters according to the scale on which structure is exhibited in the training data. The regularization technique is demonstrated on two real data sets, a classification problem and a regression problem.
Cite
Text
Rohwer and Van der Rest. "Minimum Description Length, Regularization, and Multimodal Data." Neural Computation, 1996. doi:10.1162/NECO.1996.8.3.595
Markdown
[Rohwer and Van der Rest. "Minimum Description Length, Regularization, and Multimodal Data." Neural Computation, 1996.](https://mlanthology.org/neco/1996/rohwer1996neco-minimum/) doi:10.1162/NECO.1996.8.3.595
BibTeX
@article{rohwer1996neco-minimum,
title = {{Minimum Description Length, Regularization, and Multimodal Data}},
author = {Rohwer, Richard and Van der Rest, John C.},
journal = {Neural Computation},
year = {1996},
pages = {595--609},
doi = {10.1162/NECO.1996.8.3.595},
volume = {8},
url = {https://mlanthology.org/neco/1996/rohwer1996neco-minimum/}
}