Self-Guided Masked Autoencoders for Domain-Agnostic Self-Supervised Learning

Abstract

Self-supervised learning excels at learning representations from large amounts of unlabeled data, demonstrating success across multiple data modalities. Yet, extending self-supervised learning to new modalities is non-trivial because existing methods are tailored to each domain, for example through domain-specific augmentations that reflect the invariances of the target task. While masked modeling is promising as a domain-agnostic framework for self-supervised learning because it does not rely on input augmentations, its mask sampling procedure remains domain-specific. We present Self-guided Masked Autoencoders (SMA), a fully domain-agnostic masked modeling method. SMA trains an attention-based model with a masked modeling objective while learning which masks to sample, without any domain-specific assumptions. We evaluate SMA on three self-supervised learning benchmarks in protein biology, chemical property prediction, and particle physics. We find that SMA learns representations without domain-specific knowledge and achieves state-of-the-art performance on all three benchmarks.
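As a rough illustration of the idea, the PyTorch sketch below masks the tokens that the model itself attends to most, then trains with a reconstruction loss on the masked positions. Everything here (the AttentionGuidedMAE class, the hard top-k selection, the tiny encoder and decoder) is a hypothetical simplification under our own assumptions, not the paper's exact algorithm: SMA learns which masks to sample, whereas the hard top-k here is a non-differentiable stand-in.

import torch
import torch.nn as nn

class AttentionGuidedMAE(nn.Module):
    # Illustrative sketch of attention-guided masked modeling; the class
    # name and the hard top-k mask selection are hypothetical, not the
    # paper's exact SMA procedure.
    def __init__(self, dim=64, num_heads=4, mask_ratio=0.5):
        super().__init__()
        self.scorer = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, num_heads, batch_first=True),
            num_layers=2,
        )
        self.decoder = nn.Linear(dim, dim)  # reconstruct token embeddings
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.mask_ratio = mask_ratio

    def forward(self, x):
        # x: (batch, seq, dim) token embeddings from any modality.
        # 1. Score tokens with the model's own attention; no
        #    domain-specific heuristic decides what to mask.
        _, attn = self.scorer(x, x, x, need_weights=True)  # attn: (B, S, S)
        scores = attn.mean(dim=1)  # attention each token receives
        # 2. Mask the most-attended tokens. Note: the hard top-k passes no
        #    gradient to the scorer; the paper instead learns the sampling.
        num_mask = int(self.mask_ratio * x.size(1))
        idx = scores.topk(num_mask, dim=-1).indices
        mask = torch.zeros_like(scores, dtype=torch.bool).scatter_(1, idx, True)
        # 3. Replace masked positions with a learned mask token.
        x_masked = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)
        # 4. Reconstruct; the loss is computed on masked positions only.
        recon = self.decoder(self.encoder(x_masked))
        return ((recon - x) ** 2)[mask].mean()

x = torch.randn(8, 16, 64)  # e.g., embedded protein or molecule tokens
loss = AttentionGuidedMAE()(x)
loss.backward()

One plausible reading of why this setup helps: masking the positions the model attends to most forces it to reconstruct the parts of the input it deems informative. In a faithful implementation, the top-k step would be replaced by the learned, domain-agnostic mask sampling the paper describes.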

Cite

Text

Xie et al. "Self-Guided Masked Autoencoders for Domain-Agnostic Self-Supervised Learning." International Conference on Learning Representations, 2024.

Markdown

[Xie et al. "Self-Guided Masked Autoencoders for Domain-Agnostic Self-Supervised Learning." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/xie2024iclr-selfguided/)

BibTeX

@inproceedings{xie2024iclr-selfguided,
  title     = {{Self-Guided Masked Autoencoders for Domain-Agnostic Self-Supervised Learning}},
  author    = {Xie, Johnathan Wenjia and Lee, Yoonho and Chen, Annie S and Finn, Chelsea},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/xie2024iclr-selfguided/}
}