Fast Cross-Validation

Abstract

Cross-validation (CV) is the most widely adopted approach for selecting the optimal model. However, CV is computationally expensive because it requires training the learner multiple times, making it impractical for large-scale model selection. In this paper, we present an approximate approach to CV based on the theoretical notion of the Bouligand influence function (BIF) and the Nyström method for kernel methods. We first establish the relationship between the BIF and CV, and propose a method to approximate CV via the Taylor expansion of the BIF. Then, we provide a novel computing method to calculate the BIF for a general distribution, and evaluate the BIF for the sample distribution. Finally, we use the Nyström method to accelerate the computation of the BIF matrix, yielding the final approximate CV criterion. The proposed approximate CV requires training only once and is suitable for a wide variety of kernel methods. Experimental results on numerous datasets show that our approximate CV has no statistical discrepancy with the original CV, yet significantly improves efficiency.
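To illustrate the kind of low-rank kernel approximation the abstract refers to, here is a minimal, generic Nyström sketch in Python with NumPy. This is not the paper's BIF-based algorithm; the RBF kernel, the uniform landmark sampling, and all function names are illustrative assumptions. The idea is to approximate an n × n kernel matrix from only m ≪ n landmark columns, at O(nm²) cost instead of O(n²).

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.05):
    # Gaussian (RBF) kernel via broadcasting: K[i, j] = exp(-gamma * ||x_i - y_j||^2)
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_approx(X, m, gamma=0.05, seed=0):
    """Rank-m Nystrom approximation of the full kernel matrix K(X, X).

    Samples m landmark points, forms K_nm (n x m) and K_mm (m x m),
    and returns K_nm @ pinv(K_mm) @ K_nm.T, avoiding the full n x n
    kernel computation.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    K_nm = rbf_kernel(X, X[idx], gamma)          # n x m block of the kernel
    K_mm = K_nm[idx]                             # m x m landmark-landmark block
    return K_nm @ np.linalg.pinv(K_mm) @ K_nm.T  # n x n low-rank approximation
```

As a sanity check, one can compare the approximation against the exact kernel matrix on a small sample; with enough landmarks relative to the kernel's effective rank, the relative Frobenius error is small.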

Cite

Text

Liu et al. "Fast Cross-Validation." International Joint Conference on Artificial Intelligence, 2018. doi:10.24963/IJCAI.2018/346

Markdown

[Liu et al. "Fast Cross-Validation." International Joint Conference on Artificial Intelligence, 2018.](https://mlanthology.org/ijcai/2018/liu2018ijcai-fast/) doi:10.24963/IJCAI.2018/346

BibTeX

@inproceedings{liu2018ijcai-fast,
  title     = {{Fast Cross-Validation}},
  author    = {Liu, Yong and Lin, Hailun and Ding, Lizhong and Wang, Weiping and Liao, Shizhong},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {2497--2503},
  doi       = {10.24963/IJCAI.2018/346},
  url       = {https://mlanthology.org/ijcai/2018/liu2018ijcai-fast/}
}