On Prediction by Data Compression

Abstract

Traditional wisdom has it that the better a theory compresses the learning data concerning some phenomenon under investigation, the better we learn and generalize, and the better the theory predicts unknown data. This belief is vindicated in practice but apparently has not been rigorously proved in a general setting. Making these ideas rigorous involves the length of the shortest effective description of an individual object: its Kolmogorov complexity. In a previous paper we have shown that optimal compression is almost always a best strategy in hypothesis identification (an ideal form of the minimum description length (MDL) principle). Whereas the single best hypothesis does not necessarily give the best prediction, we demonstrate that compression is nonetheless almost always the best strategy in prediction methods in the style of R. Solomonoff.
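The idea of prediction by compression can be illustrated with a minimal sketch (not the paper's construction, which uses the uncomputable Kolmogorov complexity): a practical compressor's output length serves as a computable upper bound on the complexity of a string, and the predicted next symbol is the one whose continuation compresses best. The function name `predict_next` and the use of `zlib` here are illustrative assumptions.

```python
import zlib

def predict_next(history: bytes, alphabet: bytes) -> int:
    """Predict the next symbol as the one whose continuation compresses best.

    A crude stand-in for Solomonoff-style prediction: the zlib-compressed
    length is a computable upper bound on Kolmogorov complexity, so the
    continuation with the shortest compressed length is the most plausible.
    """
    # min() breaks ties in favour of the symbol appearing earlier in `alphabet`.
    return min(alphabet, key=lambda s: len(zlib.compress(history + bytes([s]))))

# A highly regular sequence ending in ...ab; the compressor favours
# continuing the period-2 pattern with "a".
history = b"ab" * 2000
print(chr(predict_next(history, b"ab")))
```

Real compressors only approximate the ideal predictor, but on highly regular data like the example above the compressed-length criterion already singles out the pattern-continuing symbol.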

Cite

Text

Vitányi and Li. "On Prediction by Data Compression." European Conference on Machine Learning, 1997. doi:10.1007/3-540-62858-4_69

Markdown

[Vitányi and Li. "On Prediction by Data Compression." European Conference on Machine Learning, 1997.](https://mlanthology.org/ecmlpkdd/1997/vitanyi1997ecml-prediction/) doi:10.1007/3-540-62858-4_69

BibTeX

@inproceedings{vitanyi1997ecml-prediction,
  title     = {{On Prediction by Data Compression}},
  author    = {Vitányi, Paul M. B. and Li, Ming},
  booktitle = {European Conference on Machine Learning},
  year      = {1997},
  pages     = {14--30},
  doi       = {10.1007/3-540-62858-4_69},
  url       = {https://mlanthology.org/ecmlpkdd/1997/vitanyi1997ecml-prediction/}
}