Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers
Abstract
The hierarchical mixture of experts architecture provides a flexible procedure for implementing classification algorithms. The classification is obtained by a recursive soft partition of the feature space in a data-driven fashion. Such a procedure enables local classification where several experts are used, each of which is assigned the task of classification over some subspace of the feature space. In this work, we provide data-dependent generalization error bounds for this class of models, which lead to effective procedures for performing model selection. Tight bounds are particularly important here, because the model is highly parameterized. The theoretical results are complemented with numerical experiments based on a randomized algorithm, which mitigates the effects of the local minima that plague other approaches such as the expectation-maximization algorithm.
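The soft-partition idea in the abstract can be sketched in a few lines: a gating function assigns each input a probability of belonging to each expert's region, and the final class distribution is the gate-weighted mixture of the experts' predictions. The sketch below is illustrative only, assuming linear gate and expert scores passed through a softmax; all function and variable names are hypothetical, not from the paper.

```python
import math

def softmax(zs):
    """Numerically stable softmax over a list of scores."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

def hme_predict(x, gate_weights, expert_weights):
    """One-level mixture-of-experts prediction (hypothetical sketch).

    gate_weights: one linear weight vector per expert (soft partition).
    expert_weights: per expert, one weight vector per class.
    Returns a class probability distribution: the gate-weighted
    average of each expert's softmax output.
    """
    # Gate: linear scores over experts -> soft assignment of x to regions.
    gate = softmax([sum(w * xi for w, xi in zip(wv, x)) for wv in gate_weights])
    n_classes = len(expert_weights[0])
    out = [0.0] * n_classes
    for g, ew in zip(gate, expert_weights):
        # Each expert produces its own local class distribution.
        p = softmax([sum(w * xi for w, xi in zip(wv, x)) for wv in ew])
        out = [o + g * pi for o, pi in zip(out, p)]
    return out

# Toy usage with two experts and two classes (arbitrary weights).
x = [1.0, -0.5]
gate_w = [[0.5, 1.0], [-0.5, 0.3]]
experts = [[[1.0, 0.0], [0.0, 1.0]], [[0.3, -0.2], [-0.1, 0.4]]]
probs = hme_predict(x, gate_w, experts)
```

The hierarchical version of the paper applies this mixing recursively, with inner gating nodes soft-partitioning the subspace handed down by their parent.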
Cite
Text
Azran and Meir. "Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers." Annual Conference on Computational Learning Theory, 2004. doi:10.1007/978-3-540-27819-1_30
Markdown
[Azran and Meir. "Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers." Annual Conference on Computational Learning Theory, 2004.](https://mlanthology.org/colt/2004/azran2004colt-data/) doi:10.1007/978-3-540-27819-1_30
BibTeX
@inproceedings{azran2004colt-data,
title = {{Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers}},
author = {Azran, Arik and Meir, Ron},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2004},
pages = {427--441},
doi = {10.1007/978-3-540-27819-1_30},
url = {https://mlanthology.org/colt/2004/azran2004colt-data/}
}