Unsupervised Learning Under Latent Label Shift

Abstract

What sorts of structure might enable a learner to discover classes from unlabeled data? Traditional unsupervised learning approaches risk recovering incorrect classes based on spurious feature-space similarity. In this paper, we introduce unsupervised learning under Latent Label Shift (LLS), where label marginals $p_d(y)$ shift but class conditionals $p(\mathbf{x}|y)$ do not. This setting suggests a new principle for identifying classes: elements that shift together across domains belong to the same true class. For finite input spaces, we establish an isomorphism between LLS and topic modeling; for continuous data, we show that if each label's support contains a separable region, analogous to an anchor word, oracle access to $p(d|\mathbf{x})$ suffices to identify $p_d(y)$ and $p_d(y|\mathbf{x})$ up to permutation of latent labels. Thus motivated, we introduce a practical algorithm that leverages domain-discriminative models as follows: (i) push examples through domain discriminator $p(d|\mathbf{x})$; (ii) discretize the data by clustering examples in $p(d|\mathbf{x})$ space; (iii) perform non-negative matrix factorization on the discrete data; (iv) combine recovered $p(y|d)$ with discriminator outputs $p(d|\mathbf{x})$ to compute $p_d(y|\mathbf{x}) \; \forall d$. In semi-synthetic experiments, we show that our algorithm can use domain information to overcome a failure mode of standard unsupervised classification in which feature-space similarity does not indicate true groupings.
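Steps (ii) through (iv) of the algorithm in the abstract can be sketched in NumPy as below. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`kmeans`, `nmf`, `lls_sketch`), the toy finite input space, and the oracle discriminator are all hypothetical. Plain Lloyd's k-means and Lee-Seung multiplicative updates stand in for whichever clustering and NMF routines the paper uses. Step (iv) is one plausible reading: it inverts $p(d|\mathbf{x}) = \sum_y p(d|y)\,p(y|\mathbf{x})$ by least squares and then reweights by $p_d(y)/p(y)$. Step (i), training the domain discriminator, is replaced by an oracle computed from the known toy distributions.

```python
import numpy as np

def kmeans(points, m, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns a cluster index per row of `points`."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=m, replace=False)]
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(m):
            members = points[assign == j]
            if len(members) > 0:          # leave empty clusters where they are
                centers[j] = members.mean(axis=0)
    return assign

def nmf(V, k, n_iter=500, seed=0, eps=1e-12):
    """Lee-Seung multiplicative updates for V ~ W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def lls_sketch(disc_probs, domains, k, m, seed=0):
    """disc_probs[i, d] = p(d | x_i); domains[i] = domain index of x_i."""
    n, r = disc_probs.shape
    # (ii) discretize: cluster examples in p(d|x) space
    clusters = kmeans(disc_probs, m, seed=seed)
    # (iii) Q[c, d] = fraction of domain d falling in cluster c, then factorize
    counts = np.zeros((m, r))
    np.add.at(counts, (clusters, domains), 1.0)
    Q = counts / counts.sum(axis=0, keepdims=True)
    W, H = nmf(Q, k, seed=seed)
    H = H * W.sum(axis=0)[:, None]            # absorb W's column scale into H
    H = H / H.sum(axis=0, keepdims=True)      # H[y, d] estimates p_d(y)
    # (iv) combine recovered p_d(y) with p(d|x)
    p_dom = counts.sum(axis=0) / n                    # p(d)
    p_y = np.maximum(H @ p_dom, 1e-12)                # p(y) under the pooled data
    p_d_given_y = (H * p_dom[None, :]).T / p_y[None, :]   # shape (r, k)
    p_y_given_x = disc_probs @ np.linalg.pinv(p_d_given_y).T
    p_y_given_x = np.clip(p_y_given_x, 1e-12, None)
    p_y_given_x /= p_y_given_x.sum(axis=1, keepdims=True)
    post = p_y_given_x[:, None, :] * (H.T / p_y[None, :])[None, :, :]
    post /= post.sum(axis=2, keepdims=True)   # post[i, d, y] estimates p_d(y | x_i)
    return H, post

# Toy data (all values are made up for illustration): 2 classes, 3 domains,
# 5 discrete symbols. Symbols 0, 1 occur only in class 0 and symbols 3, 4 only
# in class 1, playing the role of anchor regions.
rng = np.random.default_rng(0)
true_p_d_y = np.array([[0.9, 0.5, 0.1],
                       [0.1, 0.5, 0.9]])               # [y, d] = p_d(y)
p_x_given_y = np.array([[0.5, 0.3, 0.2, 0.0, 0.0],
                        [0.0, 0.0, 0.2, 0.3, 0.5]])    # [y, x] = p(x | y)
domains = np.repeat(np.arange(3), 500)
labels = np.array([rng.choice(2, p=true_p_d_y[:, d]) for d in domains])
xs = np.array([rng.choice(5, p=p_x_given_y[y]) for y in labels])

# Oracle discriminator for equally sized domains: p(d|x) is proportional to p_d(x).
p_x_given_d = p_x_given_y.T @ true_p_d_y               # [x, d]
oracle = p_x_given_d / p_x_given_d.sum(axis=1, keepdims=True)
disc_probs = oracle[xs]

est_p_d_y, est_posteriors = lls_sketch(disc_probs, domains, k=2, m=3)
```

Here `est_p_d_y` has one column per domain and `est_posteriors[i, d]` is a distribution over latent labels for example `i` under domain `d`. As the abstract notes, any such recovery can hold only up to a permutation of the latent labels, so the rows of `est_p_d_y` need not line up with those of `true_p_d_y`. NMF solutions are not unique in general, so the quality of the estimate depends on the data and on the initialization.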

Cite

Text

Mani et al. "Unsupervised Learning Under Latent Label Shift." ICML 2022 Workshops: SCIS, 2022.

Markdown

[Mani et al. "Unsupervised Learning Under Latent Label Shift." ICML 2022 Workshops: SCIS, 2022.](https://mlanthology.org/icmlw/2022/mani2022icmlw-unsupervised/)

BibTeX

@inproceedings{mani2022icmlw-unsupervised,
  title     = {{Unsupervised Learning Under Latent Label Shift}},
  author    = {Mani, Pranav and Roberts, Manley and Garg, Saurabh and Lipton, Zachary Chase},
  booktitle = {ICML 2022 Workshops: SCIS},
  year      = {2022},
  url       = {https://mlanthology.org/icmlw/2022/mani2022icmlw-unsupervised/}
}