Imbalanced Mixed Linear Regression
Abstract
We consider the problem of mixed linear regression (MLR), where each observed sample belongs to one of $K$ unknown linear models. In practical applications, the mixture of the $K$ models may be imbalanced with a significantly different number of samples from each model. Unfortunately, most MLR methods do not perform well in such settings. Motivated by this practical challenge, in this work we propose Mix-IRLS, a novel, simple and fast algorithm for MLR with excellent performance on both balanced and imbalanced mixtures. In contrast to popular approaches that recover the $K$ models simultaneously, Mix-IRLS does it sequentially using tools from robust regression. Empirically, beyond imbalanced mixtures, Mix-IRLS succeeds in a broad range of additional settings where other methods fail, including small sample sizes, presence of outliers, and an unknown number of models $K$. Furthermore, Mix-IRLS outperforms competing methods on several real-world datasets, in some cases by a large margin. We complement our empirical results by deriving a recovery guarantee for Mix-IRLS, which highlights its advantage on imbalanced mixtures.
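To make the idea of sequential recovery concrete, the following is a minimal illustrative sketch and not the Mix-IRLS algorithm itself, whose details the abstract does not give. It assumes $K = 2$, an 80/20 mixture, Gaussian features, and a generic iteratively reweighted least squares loop with Cauchy-style weights. All function names, the weight function, and the residual threshold are hypothetical choices made for illustration only.

```python
import numpy as np


def make_imbalanced_mlr(n=400, p_major=0.8, noise=0.01, seed=0):
    """Draw samples from a 2-component mixed linear regression (toy data)."""
    rng = np.random.default_rng(seed)
    # Hypothetical ground-truth regression vectors, one row per component.
    betas = np.array([[1.0, -2.0, 0.5], [-1.0, 1.0, 2.0]])
    X = rng.standard_normal((n, betas.shape[1]))
    # Label 0 (majority) with probability p_major, otherwise label 1.
    labels = (rng.random(n) >= p_major).astype(int)
    y = np.einsum("ij,ij->i", X, betas[labels]) + noise * rng.standard_normal(n)
    return X, y, labels, betas


def irls_robust_fit(X, y, n_iter=50, eps=1e-8):
    """Generic robust regression via IRLS with Cauchy-style weights.

    Samples with large residuals (here, those of the other component)
    are down-weighted, so the fit drifts toward the dominant model.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least squares start
    for _ in range(n_iter):
        r = y - X @ beta
        scale = np.median(np.abs(r)) + eps  # robust residual scale
        w = 1.0 / (1.0 + (r / scale) ** 2)
        sw = np.sqrt(w)
        # Weighted least squares, solved as OLS on rescaled rows.
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta


def sequential_two_component_fit(X, y, thresh=10.0):
    """Recover the dominant model first, then fit the poorly explained rest."""
    beta1 = irls_robust_fit(X, y)
    r = np.abs(y - X @ beta1)
    scale = np.median(r) + 1e-8
    rest = r > thresh * scale  # samples the first model does not explain
    beta2 = np.linalg.lstsq(X[rest], y[rest], rcond=None)[0]
    return beta1, beta2


X, y, labels, betas = make_imbalanced_mlr()
beta1, beta2 = sequential_two_component_fit(X, y)
print("error on majority model:", np.linalg.norm(beta1 - betas[0]))
print("error on minority model:", np.linalg.norm(beta2 - betas[1]))
```

In this toy setting the imbalance works in favor of the sequential approach: the minority samples behave like outliers for the first robust fit, which is the intuition the abstract points to. A practical method would also need to handle more than two components, an unknown $K$, and true outliers, none of which this sketch attempts.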
Cite
Text
Zilber and Nadler. "Imbalanced Mixed Linear Regression." Neural Information Processing Systems, 2023.
Markdown
[Zilber and Nadler. "Imbalanced Mixed Linear Regression." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/zilber2023neurips-imbalanced/)
BibTeX
@inproceedings{zilber2023neurips-imbalanced,
title = {{Imbalanced Mixed Linear Regression}},
author = {Zilber, Pini and Nadler, Boaz},
booktitle = {Neural Information Processing Systems},
year = {2023},
url = {https://mlanthology.org/neurips/2023/zilber2023neurips-imbalanced/}
}