Robust Estimation in Regression and Classification Methods for Large Dimensional Data
Abstract
Statistical data analysis and machine learning heavily rely on error measures for regression, classification, and forecasting. Bregman divergence (BD) is a widely used family of error measures, but it is not robust to outlying observations or high-leverage points in large- and high-dimensional datasets. In this paper, we propose a new family of robust Bregman divergences, called “robust-BD”, that are less sensitive to data outliers. We explore their suitability for sparse large-dimensional regression models with incompletely specified response variable distributions and propose a new estimate, called the “penalized robust-BD estimate”, that achieves the same oracle property as ordinary non-robust penalized least-squares and penalized-likelihood estimates. We conduct extensive numerical experiments to evaluate the performance of the proposed penalized robust-BD estimate and compare it with classical approaches, and show that our proposed method improves on existing approaches. Finally, we analyze a real dataset to illustrate the practicality of our proposed method. Our findings suggest that the proposed method can be a useful tool for robust statistical data analysis and machine learning in the presence of outliers and large-dimensional data.
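To make the non-robustness claim concrete, here is a minimal sketch of the textbook Bregman divergence, D_phi(y, mu) = phi(y) - phi(mu) - phi'(mu)(y - mu) for a convex generator phi. It is not the robust-BD construction proposed in the paper, and the paper's own sign and generator conventions may differ. The function names, the toy data, and the outlier value are all assumptions chosen for illustration. With phi(x) = x^2 the divergence reduces to squared error, whose best constant fit is the sample mean, so a single outlier can shift the fit substantially.

```python
import numpy as np

def bregman_divergence(phi, grad_phi, y, mu):
    """Textbook Bregman divergence for a convex generator phi (illustrative)."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return phi(y) - phi(mu) - grad_phi(mu) * (y - mu)

# Generator phi(x) = x^2 gives the squared-error loss (y - mu)^2.
def squared(x):
    return x ** 2

def squared_grad(x):
    return 2.0 * x

def total_loss(data, mu):
    """Sum of divergences between each observation and a constant fit mu."""
    return float(np.sum(bregman_divergence(squared, squared_grad, data, mu)))

# Hypothetical toy data: five clean points plus one gross outlier.
clean = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
contaminated = np.append(clean, 100.0)

# The constant minimizing the total squared-error divergence is the sample mean.
fit_clean = float(clean.mean())
fit_contaminated = float(contaminated.mean())

print("fit without outlier:", fit_clean)
print("fit with outlier:   ", fit_contaminated)
```

In this toy setting one contaminated observation moves the fitted value far outside the range of the clean data, which is the kind of sensitivity a robust variant is meant to reduce.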
Cite
Text
Zhang et al. "Robust Estimation in Regression and Classification Methods for Large Dimensional Data." Machine Learning, 2023. doi:10.1007/S10994-023-06349-2
Markdown
[Zhang et al. "Robust Estimation in Regression and Classification Methods for Large Dimensional Data." Machine Learning, 2023.](https://mlanthology.org/mlj/2023/zhang2023mlj-robust-a/) doi:10.1007/S10994-023-06349-2
BibTeX
@article{zhang2023mlj-robust-a,
title = {{Robust Estimation in Regression and Classification Methods for Large Dimensional Data}},
author = {Zhang, Chunming and Zhu, Lixing and Shen, Yanbo},
journal = {Machine Learning},
year = {2023},
pages = {3361--3411},
doi = {10.1007/S10994-023-06349-2},
volume = {112},
url = {https://mlanthology.org/mlj/2023/zhang2023mlj-robust-a/}
}