Achieving Optimal Clustering in Gaussian Mixture Models with Anisotropic Covariance Structures

Abstract

We study clustering under anisotropic Gaussian Mixture Models (GMMs), where the covariance matrices of different clusters are unknown and not necessarily the identity matrix. We analyze two anisotropic scenarios: homogeneous, with identical covariance matrices across clusters, and heterogeneous, with a distinct covariance matrix per cluster. For these models, we derive minimax lower bounds that illustrate the critical influence of covariance structures on clustering accuracy. To solve the clustering problem, we consider a variant of Lloyd's algorithm, adapted to iteratively estimate and utilize covariance information. We prove that the adjusted algorithm not only achieves minimax optimality but also converges within a logarithmic number of iterations, thus bridging the gap between theoretical guarantees and practical efficiency.
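The abstract gives no pseudocode, so the following is only a minimal sketch of the general idea, not the authors' exact procedure: a Lloyd-type iteration that alternates hard assignments under a Mahalanobis distance with re-estimation of the centers and a shared (homogeneous) covariance matrix. The function name anisotropic_lloyd and all parameter choices below are illustrative assumptions.

import numpy as np

def anisotropic_lloyd(X, k, n_iter=20, seed=0, reg=1e-6):
    """Illustrative Lloyd-type clustering with an estimated shared covariance.

    Sketch only: assigns points by Mahalanobis distance to the current centers,
    then re-estimates the centers and a pooled covariance matrix.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    centers = X[rng.choice(n, size=k, replace=False)]   # random initialization
    cov = np.eye(d)                                      # start from the isotropic case

    for _ in range(n_iter):
        # Assignment step: squared Mahalanobis distance to each center.
        prec = np.linalg.inv(cov + reg * np.eye(d))
        diffs = X[:, None, :] - centers[None, :, :]      # shape (n, k, d)
        dists = np.einsum('nkd,de,nke->nk', diffs, prec, diffs)
        labels = dists.argmin(axis=1)

        # Update step: recompute centers and the pooled covariance estimate.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
        resid = X - centers[labels]
        cov = resid.T @ resid / n

    return labels, centers, cov

if __name__ == "__main__":
    # Toy example with two elongated (anisotropic) clusters.
    rng = np.random.default_rng(1)
    A = rng.normal([0, 0], [2.0, 0.3], size=(200, 2))
    B = rng.normal([4, 1], [2.0, 0.3], size=(200, 2))
    labels, centers, cov = anisotropic_lloyd(np.vstack([A, B]), k=2)
    print(centers)

In the heterogeneous setting described in the abstract, the single pooled covariance above would instead be replaced by a per-cluster covariance estimate used in each cluster's Mahalanobis distance.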

Cite

Text

Chen and Zhang. "Achieving Optimal Clustering in Gaussian Mixture Models with Anisotropic Covariance Structures." Neural Information Processing Systems, 2024. doi:10.52202/079017-3610

Markdown

[Chen and Zhang. "Achieving Optimal Clustering in Gaussian Mixture Models with Anisotropic Covariance Structures." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/chen2024neurips-achieving/) doi:10.52202/079017-3610

BibTeX

@inproceedings{chen2024neurips-achieving,
  title     = {{Achieving Optimal Clustering in Gaussian Mixture Models with Anisotropic Covariance Structures}},
  author    = {Chen, Xin and Zhang, Anderson Ye},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-3610},
  url       = {https://mlanthology.org/neurips/2024/chen2024neurips-achieving/}
}