Minimum Message Length Grouping of Ordered Data
Abstract
Explicit segmentation is the partitioning of data into homogeneous regions by specifying cut-points. W. D. Fisher (1958) gave an early example of explicit segmentation based on the minimisation of squared error. Fisher called this the grouping problem and came up with a polynomial-time Dynamic Programming Algorithm (DPA). Oliver, Baxter and colleagues (1996, 1997, 1998) have applied the information-theoretic Minimum Message Length (MML) principle to explicit segmentation. They have derived formulas for specifying cut-points imprecisely and have empirically shown their criterion to be superior to other segmentation methods (AIC, MDL and BIC). We use a simple MML criterion and Fisher’s DPA to perform numerical Bayesian (summing and) integration (using message lengths) over the cut-point location parameters. This gives an estimate of the number of segments, which we then use to estimate the cut-point positions and segment parameters by minimising the MML criterion. This is shown to have lower Kullback-Leibler distances on generated data.
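To make the grouping problem concrete, the following is a minimal sketch of a dynamic program in the spirit of Fisher's squared-error grouping. It is not the paper's method: it assumes the number of segments `k` is given and uses plain within-segment squared error as the cost, whereas the paper uses an MML criterion and estimates the number of segments. The function name `fisher_grouping` and its return format are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def fisher_grouping(xs: List[float], k: int) -> Tuple[float, List[int]]:
    """Split ordered data into k contiguous segments with minimum total squared error.

    Returns (total_sse, cuts), where each cut is the index at which a new
    segment starts. Illustrative sketch only; squared error stands in for
    the MML criterion used in the paper.
    """
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(xs)")

    # Prefix sums of x and x^2 give the squared error of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i: int, j: int) -> float:
        """Squared error of xs[i:j] about its own mean (requires i < j)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[g][j]: best total squared error for xs[:j] split into g segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    # back[g][j]: start index of the last segment in that best split.
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0

    for g in range(1, k + 1):
        for j in range(g, n + 1):
            for i in range(g - 1, j):
                c = cost[g - 1][i] + sse(i, j)
                if c < cost[g][j]:
                    cost[g][j] = c
                    back[g][j] = i

    # Walk the back-pointers from the end to recover the k - 1 cut-points.
    cuts: List[int] = []
    j = n
    for g in range(k, 1, -1):
        j = back[g][j]
        cuts.append(j)
    cuts.reverse()
    return cost[k][n], cuts
```

This sketch runs in O(k n²) time, which is polynomial in the data length. For example, `fisher_grouping([1, 1, 1, 5, 5, 9, 9, 9], 3)` places cuts at indices 3 and 5, where the level changes. Swapping `sse` for a message-length cost per segment would be one way to move toward an MML-style criterion, though the details of that cost are not given in the abstract.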
Cite
Text
Fitzgibbon et al. "Minimum Message Length Grouping of Ordered Data." International Conference on Algorithmic Learning Theory, 2000. doi:10.1007/3-540-40992-0_5
Markdown
[Fitzgibbon et al. "Minimum Message Length Grouping of Ordered Data." International Conference on Algorithmic Learning Theory, 2000.](https://mlanthology.org/alt/2000/fitzgibbon2000alt-minimum/) doi:10.1007/3-540-40992-0_5
BibTeX
@inproceedings{fitzgibbon2000alt-minimum,
title = {{Minimum Message Length Grouping of Ordered Data}},
author = {Fitzgibbon, Leigh J. and Allison, Lloyd and Dowe, David L.},
booktitle = {International Conference on Algorithmic Learning Theory},
year = {2000},
pages = {56-70},
doi = {10.1007/3-540-40992-0_5},
url = {https://mlanthology.org/alt/2000/fitzgibbon2000alt-minimum/}
}