Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers

Abstract

The hierarchical mixture of experts architecture provides a flexible procedure for implementing classification algorithms. The classification is obtained by a recursive soft partition of the feature space in a data-driven fashion. Such a procedure enables local classification, where several experts are used, each assigned the task of classification over some subspace of the feature space. In this work, we provide data-dependent generalization error bounds for this class of models, which lead to effective procedures for performing model selection. Tight bounds are particularly important here, because the model is highly parameterized. The theoretical results are complemented with numerical experiments based on a randomized algorithm, which mitigates the effects of the local minima that plague other approaches such as the expectation-maximization algorithm.
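To make the soft-partitioning idea concrete, here is a minimal sketch of a one-level mixture of experts classifier in NumPy (the paper studies the recursive, hierarchical case; a single gating level suffices to illustrate the mechanism). All names and the linear form of the gate and experts are illustrative assumptions, not the authors' construction: a gating network softly assigns each input to experts, and the prediction is the gate-weighted average of the experts' class probabilities.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax along the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_predict(x, gate_w, expert_ws):
    """Illustrative one-level mixture of experts (linear gate and experts).

    x         : (n, d) inputs
    gate_w    : (d, E) gating weights -- soft partition of the feature space
    expert_ws : list of E arrays of shape (d, C), one linear classifier
                per expert, each responsible for its soft region
    Returns (n, C) class probabilities.
    """
    g = softmax(x @ gate_w)                                   # (n, E) gate weights
    p = np.stack([softmax(x @ w) for w in expert_ws], axis=1)  # (n, E, C) expert outputs
    return (g[:, :, None] * p).sum(axis=1)                     # gate-weighted mixture
```

In the hierarchical version, each expert slot is itself replaced by another gated mixture, yielding the recursive soft partition described in the abstract.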

Cite

Text

Azran and Meir. "Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers." Annual Conference on Computational Learning Theory, 2004. doi:10.1007/978-3-540-27819-1_30

Markdown

[Azran and Meir. "Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers." Annual Conference on Computational Learning Theory, 2004.](https://mlanthology.org/colt/2004/azran2004colt-data/) doi:10.1007/978-3-540-27819-1_30

BibTeX

@inproceedings{azran2004colt-data,
  title     = {{Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers}},
  author    = {Azran, Arik and Meir, Ron},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2004},
  pages     = {427--441},
  doi       = {10.1007/978-3-540-27819-1_30},
  url       = {https://mlanthology.org/colt/2004/azran2004colt-data/}
}