Robust Clustering Using Gaussian Mixtures in the Presence of Cellwise Outliers

Abstract

In this paper, we propose a novel algorithm for robust estimation of Gaussian Mixture Model (GMM) parameters and for clustering that explicitly accounts for cellwise outliers. The proposed algorithm minimizes a penalized negative log-likelihood function whose penalty term is derived via the false discovery rate principle. This penalized function is minimized cyclically over the outlier positions and the GMM parameters. The minimization over the GMM parameters uses the majorization-minimization framework: we minimize a tight upper bound on the negative log-likelihood function, which decouples into simpler optimization subproblems that can be solved efficiently. We present several numerical simulation studies that evaluate the performance of the proposed method on synthetic as well as real-world data and systematically compare it with state-of-the-art robust techniques in different scenarios. The simulation studies demonstrate that our approach effectively addresses the challenges inherent in GMM parameter estimation and clustering in contaminated data environments.
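To illustrate the majorization-minimization idea mentioned in the abstract, the following is a minimal sketch for a plain one-dimensional GMM. It is not the paper's algorithm: it omits the cellwise outlier positions and the false-discovery-rate penalty, and it assumes the familiar Jensen-type bound (the same one that underlies EM) as the tight upper bound. All function names, the toy data, and the variance floor are hypothetical choices made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def neg_log_lik(data, weights, means, variances):
    """Negative log-likelihood of a 1-D GMM (no outlier handling, no penalty)."""
    total = 0.0
    for x in data:
        mix = sum(w * normal_pdf(x, m, v)
                  for w, m, v in zip(weights, means, variances))
        total -= math.log(mix)
    return total


def responsibilities(data, weights, means, variances):
    """Posterior component probabilities at the current parameters."""
    resp = []
    for x in data:
        joint = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        s = sum(joint)
        resp.append([j / s for j in joint])
    return resp


def surrogate(data, resp, weights, means, variances):
    """Jensen upper bound on the negative log-likelihood for fixed resp.

    It is a sum over components, so each component's parameters can be
    updated separately (the "decoupling" into simpler subproblems).
    """
    total = 0.0
    for x, r in zip(data, resp):
        for rk, w, m, v in zip(r, weights, means, variances):
            if rk > 0.0:
                total -= rk * math.log(w * normal_pdf(x, m, v) / rk)
    return total


def mm_step(data, resp, var_floor=1e-6):
    """Closed-form minimizer of the surrogate, one component at a time.

    var_floor is an assumed safeguard against degenerate components.
    """
    n = len(data)
    weights, means, variances = [], [], []
    for k in range(len(resp[0])):
        nk = sum(r[k] for r in resp)
        mean = sum(r[k] * x for r, x in zip(resp, data)) / nk
        var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, data)) / nk
        weights.append(nk / n)
        means.append(mean)
        variances.append(max(var, var_floor))
    return weights, means, variances


if __name__ == "__main__":
    data = [-2.1, -1.9, -2.0, 1.8, 2.2, 2.0]  # toy data, made up
    w, m, v = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
    for _ in range(5):
        r = responsibilities(data, w, m, v)
        w, m, v = mm_step(data, r)
        print(neg_log_lik(data, w, m, v))
```

In this sketch the surrogate equals the negative log-likelihood at the current parameters and lies above it elsewhere, so each `mm_step` cannot increase the objective. A cyclic scheme of the kind the abstract describes would alternate such a parameter update with an update of the flagged outlier cells; that second step is not shown here because the abstract does not specify it.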

Cite

Text

Rajpurohit et al. "Robust Clustering Using Gaussian Mixtures in the Presence of Cellwise Outliers." Transactions on Machine Learning Research, 2026.

Markdown

[Rajpurohit et al. "Robust Clustering Using Gaussian Mixtures in the Presence of Cellwise Outliers." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/rajpurohit2026tmlr-robust/)

BibTeX

@article{rajpurohit2026tmlr-robust,
  title     = {{Robust Clustering Using Gaussian Mixtures in the Presence of Cellwise Outliers}},
  author    = {Rajpurohit, Pushpendra and Stoica, Petre and Babu, Prabhu},
  journal   = {Transactions on Machine Learning Research},
  year      = {2026},
  url       = {https://mlanthology.org/tmlr/2026/rajpurohit2026tmlr-robust/}
}