On Socially Fair Low-Rank Approximation and Column Subset Selection
Abstract
Low-rank approximation and column subset selection are two fundamental and related problems that are applied across a wealth of machine learning applications. In this paper, we study the question of socially fair low-rank approximation and socially fair column subset selection, where the goal is to minimize the loss over all sub-populations of the data. We show that surprisingly, even constant-factor approximation to fair low-rank approximation requires exponential time under certain standard complexity hypotheses. On the positive side, we give an algorithm for fair low-rank approximation that, for a constant number of groups and constant-factor accuracy, runs in $2^{\text{poly}(k)}$ rather than the naive $n^{\text{poly}(k)}$, which is a substantial improvement when the dataset has a large number $n$ of observations. We then show that there exist bicriteria approximation algorithms for fair low-rank approximation and fair column subset selection that runs in polynomial time.
Cite
Text
Song et al. "On Socially Fair Low-Rank Approximation and Column Subset Selection." Neural Information Processing Systems, 2024. doi:10.52202/079017-2819Markdown
[Song et al. "On Socially Fair Low-Rank Approximation and Column Subset Selection." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/song2024neurips-socially/) doi:10.52202/079017-2819BibTeX
@inproceedings{song2024neurips-socially,
title = {{On Socially Fair Low-Rank Approximation and Column Subset Selection}},
author = {Song, Zhao and Vakilian, Ali and Woodruff, David P. and Zhou, Samson},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-2819},
url = {https://mlanthology.org/neurips/2024/song2024neurips-socially/}
}