$\ell_p$-Regression in the Arbitrary Partition Model of Communication

Abstract

We consider the randomized communication complexity of the distributed $\ell_p$-regression problem in the coordinator model, for $p\in (0,2]$. In this problem, there is a coordinator and $s$ servers. The $i$-th server receives $A^i\in\{-M, -M+1, \ldots, M\}^{n\times d}$ and $b^i\in\{-M, -M+1, \ldots, M\}^n$, and the coordinator would like to find a $(1+\epsilon)$-approximate solution to $\min_{x\in\mathbb{R}^d} \|(\sum_i A^i)x - (\sum_i b^i)\|_p$. For convenience, we assume $M \leq \mathrm{poly}(nd)$. This model, in which the data is additively shared across the servers, is commonly referred to as the arbitrary partition model. We obtain significantly improved bounds for this problem. For $p = 2$, i.e., least squares regression, we give the first optimal bound of $\tilde{\Theta}(sd^2 + sd/\epsilon)$ bits. For $p \in (1,2)$, we obtain an $\tilde{O}(sd^2/\epsilon + sd/\mathrm{poly}(\epsilon))$ upper bound. Notably, for $d$ sufficiently large, our leading-order term depends only linearly on $1/\epsilon$ rather than quadratically. We also show communication lower bounds of $\Omega(sd^2 + sd/\epsilon^2)$ for $p\in (0,1]$ and $\Omega(sd^2 + sd/\epsilon)$ for $p\in (1,2]$. Our bounds considerably improve the previous bounds of Woodruff et al. (COLT 2013) and Vempala et al. (SODA 2020).
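To make the setup concrete, here is a minimal Python sketch (the names and parameter values are our own illustration, not from the paper): it generates additive shares $(A^i, b^i)$, assembles the implicit instance $A=\sum_i A^i$ and $b=\sum_i b^i$, and evaluates the $\ell_p$ objective. It solves the $p=2$ case centrally via ordinary least squares and deliberately ignores communication cost, which is precisely what the paper's protocols are designed to minimize.

```python
import numpy as np

# Illustrative sketch of the arbitrary partition model (not the paper's protocol):
# each of s servers holds an additive share (A^i, b^i); the implicit instance is
# A = sum_i A^i, b = sum_i b^i, and the goal is min_x ||Ax - b||_p.

rng = np.random.default_rng(0)
s, n, d, M, p = 4, 100, 5, 50, 2.0  # hypothetical parameter values

# Additive shares with integer entries in {-M, ..., M}.
A_shares = [rng.integers(-M, M + 1, size=(n, d)) for _ in range(s)]
b_shares = [rng.integers(-M, M + 1, size=n) for _ in range(s)]

# The instance implicitly defined by summing the servers' shares.
A = sum(A_shares)
b = sum(b_shares)

def lp_cost(x, p):
    # The objective ||Ax - b||_p; for p < 1 this is not a norm,
    # but the objective is defined by the same formula.
    return float(np.sum(np.abs(A @ x - b) ** p) ** (1.0 / p))

# For p = 2 the exact minimizer has a closed form (ordinary least squares);
# a communication-efficient protocol must approximate this cost to (1 + eps)
# without ever materializing A and b at the coordinator.
x_opt, *_ = np.linalg.lstsq(A, b, rcond=None)
print(f"optimal l_{p} cost: {lp_cost(x_opt, p):.4f}")
```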

Cite

Text

Li et al. "$\ell_p$-Regression in the Arbitrary Partition Model of Communication." Conference on Learning Theory, 2023.

Markdown

[Li et al. "$\ell_p$-Regression in the Arbitrary Partition Model of Communication." Conference on Learning Theory, 2023.](https://mlanthology.org/colt/2023/li2023colt-pregression/)

BibTeX

@inproceedings{li2023colt-pregression,
  title     = {{$\ell_p$-Regression in the Arbitrary Partition Model of Communication}},
  author    = {Li, Yi and Lin, Honghao and Woodruff, David},
  booktitle = {Conference on Learning Theory},
  year      = {2023},
  pages     = {4902--4928},
  volume    = {195},
  url       = {https://mlanthology.org/colt/2023/li2023colt-pregression/}
}