A Refined Margin Distribution Analysis for Forest Representation Learning

Abstract

In this paper, we formulate the forest representation learning approach called \textsc{CasDF} as an additive model that boosts the augmented feature instead of the prediction. We substantially improve the upper bound of the generalization gap from $\mathcal{O}(\sqrt{\ln m/m})$ to $\mathcal{O}(\ln m/m)$ when the margin ratio, defined as the ratio of the margin standard deviation to the margin mean, is sufficiently small. This tighter upper bound motivates us to optimize the ratio. We therefore design a margin distribution reweighting approach for deep forest that achieves a small margin ratio by boosting the augmented feature. Experiments confirm the correlation between the margin distribution and generalization performance. This study offers a novel understanding of \textsc{CasDF} from the perspective of margin theory and further guides layer-by-layer forest representation learning.
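
As a rough illustration of the quantity the bound depends on, the sketch below computes a margin ratio from class scores. It assumes the common multi-class margin (score of the true class minus the largest score among the other classes) and the population standard deviation; the paper's exact definitions may differ, and the names `margins` and `margin_ratio` are hypothetical helpers, not part of any \textsc{CasDF} implementation.

```python
import numpy as np


def margins(scores, labels):
    """Per-sample margin: true-class score minus the best other-class score.

    Assumption: this is the standard multi-class margin, used here only
    for illustration.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    idx = np.arange(len(labels))
    true_score = scores[idx, labels]
    # Mask out the true class so the max runs over the other classes only.
    masked = scores.copy()
    masked[idx, labels] = -np.inf
    return true_score - masked.max(axis=1)


def margin_ratio(scores, labels):
    """Margin standard deviation divided by margin mean.

    Assumption: population standard deviation (ddof=0). The ratio is only
    meaningful when the mean margin is positive.
    """
    m = margins(scores, labels)
    mean = m.mean()
    if mean <= 0:
        raise ValueError("mean margin must be positive for the ratio to be meaningful")
    return m.std() / mean


# Toy example: three samples, two classes.
example_scores = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
example_labels = [0, 0, 1]
example_ratio = margin_ratio(example_scores, example_labels)
```

A smaller value means the margins are both large on average and tightly concentrated, which is the regime in which the abstract states the tighter bound applies.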

Cite

Text

Lyu et al. "A Refined Margin Distribution Analysis for Forest Representation Learning." Neural Information Processing Systems, 2019.

Markdown

[Lyu et al. "A Refined Margin Distribution Analysis for Forest Representation Learning." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/lyu2019neurips-refined/)

BibTeX

@inproceedings{lyu2019neurips-refined,
  title     = {{A Refined Margin Distribution Analysis for Forest Representation Learning}},
  author    = {Lyu, Shen-Huan and Yang, Liang and Zhou, Zhi-Hua},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {5530--5540},
  url       = {https://mlanthology.org/neurips/2019/lyu2019neurips-refined/}
}