Minimum Description Length, Regularization, and Multimodal Data

Abstract

Relationships between clustering, description length, and regularization are pointed out, motivating the introduction of a cost function with a description length interpretation and the unusual and useful property of having its minimum approximated by the densest mode of a distribution. A simple inverse kinematics example is used to demonstrate that this property can be used to select and learn one branch of a multivalued mapping. This property is also used to develop a method for setting regularization parameters according to the scale on which structure is exhibited in the training data. The regularization technique is demonstrated on two real data sets, a classification problem and a regression problem.
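The mode-seeking property described in the abstract can be illustrated with a toy one-dimensional example. The sketch below is not the cost function from the paper, which is not reproduced on this page. It assumes a generic Gaussian-kernel cost (the names `kernel_cost` and `width`, and the synthetic data, are hypothetical) and contrasts its minimizer with that of ordinary squared error on bimodal data.

```python
import numpy as np

# Synthetic bimodal sample (an assumption for illustration only):
# 70% of points near -2, 30% near +3.
rng = np.random.default_rng(0)
data = np.concatenate([
    rng.normal(-2.0, 0.3, size=700),
    rng.normal(3.0, 0.3, size=300),
])

def squared_error_cost(m, x):
    """Mean squared error of a single estimate m; minimized by the sample mean."""
    return np.mean((x - m) ** 2)

def kernel_cost(m, x, width=0.5):
    """Hypothetical mode-seeking cost: negative mean Gaussian kernel response.

    Points far from m contribute almost nothing, so the minimum sits
    near the densest cluster rather than between clusters.
    """
    return -np.mean(np.exp(-((x - m) ** 2) / (2.0 * width ** 2)))

# Brute-force minimization over a grid of candidate estimates.
grid = np.linspace(-5.0, 6.0, 1101)
mse_values = np.array([squared_error_cost(m, data) for m in grid])
kernel_values = np.array([kernel_cost(m, data) for m in grid])

mean_estimate = grid[np.argmin(mse_values)]
mode_estimate = grid[np.argmin(kernel_values)]

print("squared-error minimizer:", mean_estimate)
print("kernel-cost minimizer:  ", mode_estimate)
```

Under these assumptions the squared-error minimizer falls between the two clusters, where there is almost no data, while the kernel-cost minimizer falls inside the denser cluster. This is the kind of behavior that would let a learner commit to one branch of a multivalued mapping instead of averaging across branches.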

Cite

Text

Rohwer and Van der Rest. "Minimum Description Length, Regularization, and Multimodal Data." Neural Computation, 1996. doi:10.1162/NECO.1996.8.3.595

Markdown

[Rohwer and Van der Rest. "Minimum Description Length, Regularization, and Multimodal Data." Neural Computation, 1996.](https://mlanthology.org/neco/1996/rohwer1996neco-minimum/) doi:10.1162/NECO.1996.8.3.595

BibTeX

@article{rohwer1996neco-minimum,
  title     = {{Minimum Description Length, Regularization, and Multimodal Data}},
  author    = {Rohwer, Richard and Van der Rest, John C.},
  journal   = {Neural Computation},
  year      = {1996},
  pages     = {595--609},
  doi       = {10.1162/NECO.1996.8.3.595},
  volume    = {8},
  url       = {https://mlanthology.org/neco/1996/rohwer1996neco-minimum/}
}