An Efficient Bayes Error Rate Estimation Method
Abstract
The Bayes error rate is the lowest error rate that any classifier can achieve, and estimating the Bayes error rate is a generally acknowledged difficult problem in machine learning. Recently, a consistent estimator of the Bayes error rate has been proposed, and the method to calculate this estimator is called BN-BER. Although the consistency of the estimator has been proven, whether the estimator is unbiased remains to be analyzed, because consistency and unbiasedness are two important properties to measure the effectiveness of an estimator. Besides, the time and space complexity of BN-BER are high, resulting in high runnning costs for large-scale data. To address the above issues, we first prove the unbiasedness of the estimator. Subsequently, addressing the high time and space complexity of BN-BER, we propose an approach for estimating Bayes error rates based on hierarchical k -means clustering and approximate k -nearest neighbor algorithm. Through analysis, we found that the time and space complexity of the proposed method is $O(n(\log _{2}n)^2)$ O ( n ( log 2 n ) 2 ) , whereas the time and space complexity of the BN-BER method is $O(n^2)$ O ( n 2 ) . The effectiveness and efficiency of the proposed method are verified on a large number of synthetic datasets and benchmark datasets.
Cite
Text
Chen et al. "An Efficient Bayes Error Rate Estimation Method." Machine Learning, 2025. doi:10.1007/S10994-025-06761-WMarkdown
[Chen et al. "An Efficient Bayes Error Rate Estimation Method." Machine Learning, 2025.](https://mlanthology.org/mlj/2025/chen2025mlj-efficient/) doi:10.1007/S10994-025-06761-WBibTeX
@article{chen2025mlj-efficient,
title = {{An Efficient Bayes Error Rate Estimation Method}},
author = {Chen, Qingqiang and Cao, Fuyuan and Xing, Ying and Liang, Jiye},
journal = {Machine Learning},
year = {2025},
pages = {134},
doi = {10.1007/S10994-025-06761-W},
volume = {114},
url = {https://mlanthology.org/mlj/2025/chen2025mlj-efficient/}
}