Faster Boosting with Smaller Memory

Abstract

State-of-the-art implementations of boosting, such as XGBoost and LightGBM, can process large training sets extremely fast. However, this performance requires memory large enough to hold 2-3 times the training set. This paper presents an alternative approach to implementing boosted trees that achieves a significant speedup over XGBoost and LightGBM, especially when memory is limited. This is achieved by combining three techniques: early stopping, effective sample size, and stratified sampling. Our experiments demonstrate a 10-100x speedup over XGBoost when the training data is too large to fit in memory.
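To make the sampling criterion concrete, below is a minimal sketch of the standard effective-sample-size statistic over a set of boosting weights. This is the common definition of ESS, not necessarily the paper's exact implementation, and the resampling threshold is an assumed parameter for illustration.

import numpy as np

def effective_sample_size(weights: np.ndarray) -> float:
    """Standard ESS statistic: (sum w)^2 / sum w^2.

    Equals n when all weights are equal and shrinks toward 1
    as the weight distribution becomes more skewed.
    """
    s = weights.sum()
    return (s * s) / np.dot(weights, weights)

# Hypothetical usage: refresh the in-memory working set once the
# boosting weights become too skewed for it to stay representative.
weights = np.exp(-np.random.rand(10_000))  # stand-in for boosting weights
if effective_sample_size(weights) < 0.5 * len(weights):  # 0.5 is an assumed threshold
    pass  # draw a fresh stratified sample from disk (not shown)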

Cite

Text

Alafate and Freund. "Faster Boosting with Smaller Memory." Neural Information Processing Systems, 2019.

Markdown

[Alafate and Freund. "Faster Boosting with Smaller Memory." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/alafate2019neurips-faster/)

BibTeX

@inproceedings{alafate2019neurips-faster,
  title     = {{Faster Boosting with Smaller Memory}},
  author    = {Alafate, Julaiti and Freund, Yoav S.},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {11371--11380},
  url       = {https://mlanthology.org/neurips/2019/alafate2019neurips-faster/}
}