Tackling Combinatorial Distribution Shift: A Matrix Completion Perspective

Abstract

Obtaining rigorous statistical guarantees for generalization under distribution shift remains an open and active research area. We study a setting we call \emph{combinatorial distribution shift}, where (a) under the test- and training-distributions, the labels $z$ are determined by pairs of features $(x,y)$, (b) the training distribution has coverage of certain \emph{marginal} distributions over $x$ and $y$ separately, but (c) the test distribution involves examples from a product distribution over $(x,y)$ that is \emph{not} covered by the training distribution. Focusing on the special case where the labels are given by \emph{bilinear embeddings} into a Hilbert space $\mathcal H$: $\mathbb{E}[z \mid x,y] = \langle f_{\star}(x),g_{\star}(y)\rangle_{\mathcal{H}}$, we aim to extrapolate to a test distribution domain that is not covered in training, or \emph{bilinear combinatorial extrapolation}. Our setting generalizes a special case of matrix completion from missing-not-at-random data, for which all existing results require the ground-truth matrices to be either \emph{exactly low-rank}, or to exhibit very sharp spectral cutoffs. In this work, we develop a series of theoretical results that enable bilinear combinatorial extrapolation under \emph{gradual} spectral decay as observed in typical high-dimensional data, including novel algorithms, generalization guarantees, and linear-algebraic results. A key tool is a novel perturbation bound for the rank-$k$ singular value decomposition approximations between two matrices that depends on the \emph{relative} spectral gap rather than the \emph{absolute} spectral gap, a result we think may be of broader independent interest.
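
To make the setting concrete, the following is a minimal sketch of the \emph{exactly low-rank} special case that the abstract contrasts against; it is not the paper's algorithm and does not handle gradual spectral decay. It assumes finite feature sets, so that the bilinear model becomes a matrix $M = F G^\top$ split into four blocks, with $M_{11}$, $M_{12}$, $M_{21}$ observed and $M_{22}$ unobserved. It then uses the standard identity $M_{22} = M_{21} M_{11}^{\dagger} M_{12}$, which holds when $M_{11}$ has the same rank as $M$. The function name `extrapolate_block`, the block sizes, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def extrapolate_block(m11, m12, m21, rank):
    """Estimate the unobserved block as M21 @ pinv_k(M11) @ M12.

    pinv_k is the pseudoinverse of the rank-`rank` SVD truncation of M11.
    Illustrative only: exact when the full matrix has rank `rank` and M11
    shares that rank; this is an assumption of the sketch, not of the paper.
    """
    u, s, vt = np.linalg.svd(m11, full_matrices=False)
    u_k, s_k, vt_k = u[:, :rank], s[:rank], vt[:rank, :]
    pinv_k = vt_k.T @ np.diag(1.0 / s_k) @ u_k.T
    return m21 @ pinv_k @ m12


# Synthetic bilinear ground truth: M[i, j] = <f(x_i), g(y_j)> with rank r.
rng = np.random.default_rng(0)
r = 3
f = rng.standard_normal((40, r))  # hypothetical embeddings of x
g = rng.standard_normal((30, r))  # hypothetical embeddings of y
m = f @ g.T

# Observed blocks cover each marginal; the (2, 2) block is never observed.
m11, m12 = m[:20, :15], m[:20, 15:]
m21, m22 = m[20:, :15], m[20:, 15:]

m22_hat = extrapolate_block(m11, m12, m21, r)
err = np.max(np.abs(m22_hat - m22))
print(f"max abs error on the unobserved block: {err:.2e}")
```

In this noiseless, exactly rank-$r$ example the recovery error is at the level of floating-point round-off. With noisy observations or slowly decaying singular values, inverting the small singular values of $M_{11}$ can amplify errors, which is the regime the abstract says the paper targets.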

Cite

Text

Simchowitz et al. "Tackling Combinatorial Distribution Shift: A Matrix Completion Perspective." Conference on Learning Theory, 2023.

Markdown

[Simchowitz et al. "Tackling Combinatorial Distribution Shift: A Matrix Completion Perspective." Conference on Learning Theory, 2023.](https://mlanthology.org/colt/2023/simchowitz2023colt-tackling/)

BibTeX

@inproceedings{simchowitz2023colt-tackling,
  title     = {{Tackling Combinatorial Distribution Shift: A Matrix Completion Perspective}},
  author    = {Simchowitz, Max and Gupta, Abhishek and Zhang, Kaiqing},
  booktitle = {Conference on Learning Theory},
  year      = {2023},
  pages     = {3356--3468},
  volume    = {195},
  url       = {https://mlanthology.org/colt/2023/simchowitz2023colt-tackling/}
}