Learning Representations by Maximizing Mutual Information Across Views

Philip Bachman, R Devon Hjelm, William Buchwalter

NeurIPS 2019 pp. 15535-15545


Abstract

We propose an approach to self-supervised representation learning based on maximizing mutual information between features extracted from multiple views of a shared context. For example, one could produce multiple views of a local spatio-temporal context by observing it from different locations (e.g., camera positions within a scene), and via different modalities (e.g., tactile, auditory, or visual). Or, an ImageNet image could provide a context from which one produces multiple views by repeatedly applying data augmentation. Maximizing mutual information between features extracted from these views requires capturing information about high-level factors whose influence spans multiple views – e.g., presence of certain objects or occurrence of certain events. Following our proposed approach, we develop a model which learns image representations that significantly outperform prior methods on the tasks we consider. Most notably, using self-supervised learning, our model learns representations which achieve 68.1% accuracy on ImageNet using standard linear evaluation. This beats prior results by over 12% and concurrent results by 7%. When we extend our model to use mixture-based representations, segmentation behaviour emerges as a natural side-effect. Our code is available online: https://github.com/Philip-Bachman/amdim-public.
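The abstract does not state which mutual-information estimator is used, so the following is only a minimal sketch under an assumption: it uses the contrastive InfoNCE bound, a common choice for this kind of objective, applied to features from two views of the same contexts. The function names, the cosine-similarity critic, and the temperature value are illustrative and are not taken from the authors' code.

```python
import numpy as np

def info_nce_loss(feats_a, feats_b, temperature=0.1):
    """InfoNCE loss between two batches of view features.

    Assumption: feats_a[i] and feats_b[i] come from the same context
    (e.g., two augmentations of image i); every other pairing in the
    batch is treated as a negative.
    """
    # L2-normalize so that pairwise scores are cosine similarities.
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (n, n) matrix of pairwise scores
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal.
    return -np.mean(np.diag(log_probs))

def mi_lower_bound(feats_a, feats_b, temperature=0.1):
    """log(n) minus the InfoNCE loss: a lower bound on mutual information, in nats."""
    n = feats_a.shape[0]
    return np.log(n) - info_nce_loss(feats_a, feats_b, temperature)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shared = rng.normal(size=(64, 16))  # stand-in for high-level factors
    view_a = shared + 0.1 * rng.normal(size=shared.shape)
    view_b = shared + 0.1 * rng.normal(size=shared.shape)
    unrelated = rng.normal(size=shared.shape)
    print("bound, views sharing factors:", mi_lower_bound(view_a, view_b))
    print("bound, unrelated features:   ", mi_lower_bound(view_a, unrelated))
```

In this toy setup the bound is higher when both views carry the same underlying factors than when the second set of features is unrelated, which mirrors the abstract's point that the objective rewards capturing information shared across views. The bound can never exceed log(n) for a batch of n pairs, so larger batches are needed to certify larger amounts of shared information.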


Cite

Text

Bachman et al. "Learning Representations by Maximizing Mutual Information Across Views." Neural Information Processing Systems, 2019.

Markdown

[Bachman et al. "Learning Representations by Maximizing Mutual Information Across Views." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/bachman2019neurips-learning/)

BibTeX

@inproceedings{bachman2019neurips-learning,
  title     = {{Learning Representations by Maximizing Mutual Information Across Views}},
  author    = {Bachman, Philip and Hjelm, R Devon and Buchwalter, William},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {15535-15545},
  url       = {https://mlanthology.org/neurips/2019/bachman2019neurips-learning/}
}