Robust Estimation in Regression and Classification Methods for Large Dimensional Data

Zhang, Chunming; Zhu, Lixing; Shen, Yanbo

doi:10.1007/S10994-023-06349-2

MLJ 2023 pp. 3361-3411

Abstract

Statistical data analysis and machine learning rely heavily on error measures for regression, classification, and forecasting. Bregman divergence (BD) is a widely used family of error measures, but it is not robust to outlying observations or high leverage points in large- and high-dimensional datasets. In this paper, we propose a new family of robust Bregman divergences called “robust-BD” that are less sensitive to data outliers. We explore their suitability for sparse large-dimensional regression models with incompletely specified response variable distributions and propose a new estimate called the “penalized robust-BD estimate” that achieves the same oracle property as ordinary non-robust penalized least-squares and penalized-likelihood estimates. We conduct extensive numerical experiments to evaluate the performance of the proposed penalized robust-BD estimate and compare it with classical approaches, and show that our proposed method improves on existing approaches. Finally, we analyze a real dataset to illustrate the practicality of our proposed method. Our findings suggest that the proposed method can be a useful tool for robust statistical data analysis and machine learning in the presence of outliers and large-dimensional data.
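For readers unfamiliar with the term, the following is a minimal sketch of the standard textbook Bregman divergence, D_phi(x, y) = phi(x) - phi(y) - phi'(y)(x - y) for a convex generator phi. It is not the authors' robust-BD construction, whose definition is not given on this page; the function names and the sample values are illustrative assumptions. It shows that the squared-error member of the family grows without bound as an observation moves away from the fitted value, which is the sensitivity to outliers the abstract refers to.

```python
import math

def bregman_divergence(phi, dphi, x, y):
    """Standard Bregman divergence: phi(x) - phi(y) - dphi(y) * (x - y).

    phi is a convex generator and dphi its derivative (both hypothetical
    helper arguments introduced for this sketch).
    """
    return phi(x) - phi(y) - dphi(y) * (x - y)

# Generator phi(x) = x^2 recovers the squared-error loss (x - y)^2.
def square(x):
    return x * x

def d_square(x):
    return 2.0 * x

# Negative binary entropy recovers the Bernoulli Kullback-Leibler divergence.
def neg_entropy(p):
    return p * math.log(p) + (1.0 - p) * math.log(1.0 - p)

def d_neg_entropy(p):
    return math.log(p) - math.log(1.0 - p)

# An ordinary observation versus an outlying one, both compared with a fit of 1.0.
typical_loss = bregman_divergence(square, d_square, 3.0, 1.0)
outlier_loss = bregman_divergence(square, d_square, 103.0, 1.0)

# Bernoulli case: divergence between probabilities 0.5 and 0.25.
bernoulli_loss = bregman_divergence(neg_entropy, d_neg_entropy, 0.5, 0.25)

print(typical_loss, outlier_loss, bernoulli_loss)
```

A single outlying response here contributes a loss thousands of times larger than a typical one, so it can dominate an estimate that minimizes the summed divergence.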

Cite

Text

Zhang et al. "Robust Estimation in Regression and Classification Methods for Large Dimensional Data." Machine Learning, 2023. doi:10.1007/S10994-023-06349-2

Markdown

[Zhang et al. "Robust Estimation in Regression and Classification Methods for Large Dimensional Data." Machine Learning, 2023.](https://mlanthology.org/mlj/2023/zhang2023mlj-robust-a/) doi:10.1007/S10994-023-06349-2

BibTeX

@article{zhang2023mlj-robust-a,
  title     = {{Robust Estimation in Regression and Classification Methods for Large Dimensional Data}},
  author    = {Zhang, Chunming and Zhu, Lixing and Shen, Yanbo},
  journal   = {Machine Learning},
  year      = {2023},
  pages     = {3361-3411},
  doi       = {10.1007/S10994-023-06349-2},
  volume    = {112},
  url       = {https://mlanthology.org/mlj/2023/zhang2023mlj-robust-a/}
}