Concept Boundary Detection for Speeding up SVMs
Abstract
Support Vector Machines (SVMs) suffer from an O(n²) training cost, where n denotes the number of training instances. In this paper, we propose an algorithm to select boundary instances as training data to substantially reduce n. Our algorithm is motivated by the result of (Burges, 1999) that removing non-support vectors from the training set does not change the SVM training result. Accordingly, our algorithm eliminates instances that are likely to be non-support vectors. In a concept-independent preprocessing step, we prepare nearest-neighbor lists for the training instances. In a concept-specific sampling step, we can then effectively select useful training data for each target concept. Empirical studies show our algorithm to be effective in reducing n, outperforming other competing downsampling algorithms without significantly compromising testing accuracy.
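The two-step idea in the abstract (a concept-independent nearest-neighbor preprocessing pass, followed by concept-specific selection of boundary-like instances) can be sketched as follows. This is a minimal illustration, not the paper's exact method: the scoring rule used here (fraction of opposite-class points among each instance's nearest neighbors) and the cutoff of 400 kept instances are assumptions chosen for the sketch.

```python
# Sketch of boundary-instance selection before SVM training.
# The opposite-class-neighbor score below is an illustrative proxy for
# the paper's concept-boundary score, not the published algorithm.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Concept-independent preprocessing: nearest-neighbor lists for all points.
k = 20
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
_, idx = nn.kneighbors(X)          # column 0 is (typically) the point itself

# Concept-specific sampling: score each instance by how mixed its
# neighborhood is; boundary points have many opposite-class neighbors.
scores = (y[idx[:, 1:]] != y[:, None]).mean(axis=1)
keep = np.argsort(scores)[-400:]   # keep the 400 most boundary-like points

# Train on the reduced set: n drops from 2000 to 400.
svm_small = SVC(kernel="rbf").fit(X[keep], y[keep])
print(f"kept {len(keep)} of {len(X)} instances")
```

Since instances deep inside a class have neighborhoods of a single label (score near 0), they are the likely non-support vectors this filter discards, which is what keeps the smaller training set close to the full-data decision boundary.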
Cite
Text
Panda et al. "Concept Boundary Detection for Speeding up SVMs." International Conference on Machine Learning, 2006. doi:10.1145/1143844.1143930
Markdown
[Panda et al. "Concept Boundary Detection for Speeding up SVMs." International Conference on Machine Learning, 2006.](https://mlanthology.org/icml/2006/panda2006icml-concept/) doi:10.1145/1143844.1143930
BibTeX
@inproceedings{panda2006icml-concept,
title = {{Concept Boundary Detection for Speeding up SVMs}},
author = {Panda, Navneet and Chang, Edward Y. and Wu, Gang},
booktitle = {International Conference on Machine Learning},
year = {2006},
pages = {681--688},
doi = {10.1145/1143844.1143930},
url = {https://mlanthology.org/icml/2006/panda2006icml-concept/}
}