Sparse Projection Oblique Randomer Forests
Abstract
Decision forests, including Random Forests and Gradient Boosting Trees, have recently demonstrated state-of-the-art performance in a variety of machine learning settings. Decision forests are typically ensembles of axis-aligned decision trees; that is, trees that split only along feature dimensions. In contrast, many recent extensions to decision forests are based on axis-oblique splits. Unfortunately, these extensions forfeit one or more of the favorable properties of decision forests based on axis-aligned splits, such as robustness to many noise dimensions, interpretability, or computational efficiency. We introduce yet another decision forest, called “Sparse Projection Oblique Randomer Forests” (SPORF). SPORF trees recursively split along very sparse random projections. Our method significantly improves accuracy over existing state-of-the-art algorithms on a standard benchmark suite for classification with $>100$ problems of varying dimension, sample size, and number of classes. To illustrate how SPORF addresses the limitations of both axis-aligned and existing oblique decision forest methods, we conduct extensive simulated experiments. SPORF typically yields improved performance over existing decision forest methods, while maintaining computational efficiency, scalability, and interpretability. Very sparse random projections can be incorporated into gradient boosted trees to obtain potentially similar gains.
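To make the idea of splitting along very sparse random projections concrete, the following is a minimal sketch of a single split search, not the authors' implementation. It assumes projection entries drawn from {-1, 0, +1} with a small nonzero probability, Gini impurity as the split criterion, and an exhaustive scan over midpoint thresholds. The function names and the `density` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_sparse_projections(n_features, n_projections, density, rng):
    """Sample an (n_features, n_projections) matrix with entries in {-1, 0, +1}.

    Assumption: each entry is nonzero with probability `density`, and nonzero
    entries are +1 or -1 with equal probability.
    """
    signs = rng.choice([-1.0, 1.0], size=(n_features, n_projections))
    mask = rng.random((n_features, n_projections)) < density
    return np.where(mask, signs, 0.0)


def gini(y):
    """Gini impurity of a label vector (0.0 for an empty or pure node)."""
    if len(y) == 0:
        return 0.0
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def best_oblique_split(X, y, A):
    """Find the best split among the candidate projections in A.

    Returns (weighted_impurity, projection_index, threshold). An axis-aligned
    tree would only consider columns of A that have a single nonzero entry.
    """
    Z = X @ A  # each column of Z is one candidate projected feature
    n = len(y)
    best = (np.inf, None, None)
    for j in range(Z.shape[1]):
        values = np.unique(Z[:, j])
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in (values[:-1] + values[1:]) / 2.0:
            left = Z[:, j] <= t
            score = (left.sum() * gini(y[left])
                     + (~left).sum() * gini(y[~left])) / n
            if score < best[0]:
                best = (score, j, t)
    return best
```

A tree built this way would call `sample_sparse_projections` afresh at every node, pick the split returned by `best_oblique_split`, and recurse on the two resulting subsets. Because each projection touches only a few features, a chosen split remains a short signed sum of original features, which is one plausible reading of the interpretability claim in the abstract.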
Cite
Text
Tomita et al. "Sparse Projection Oblique Randomer Forests." Journal of Machine Learning Research, 2020.
Markdown
[Tomita et al. "Sparse Projection Oblique Randomer Forests." Journal of Machine Learning Research, 2020.](https://mlanthology.org/jmlr/2020/tomita2020jmlr-sparse/)
BibTeX
@article{tomita2020jmlr-sparse,
title = {{Sparse Projection Oblique Randomer Forests}},
author = {Tomita, Tyler M. and Browne, James and Shen, Cencheng and Chung, Jaewon and Patsolic, Jesse L. and Falk, Benjamin and Priebe, Carey E. and Yim, Jason and Burns, Randal and Maggioni, Mauro and Vogelstein, Joshua T.},
journal = {Journal of Machine Learning Research},
year = {2020},
pages = {1-39},
volume = {21},
url = {https://mlanthology.org/jmlr/2020/tomita2020jmlr-sparse/}
}