DailyMAE: Towards Pretraining Masked Autoencoders in One Day

Abstract

Recently, masked image modeling (MIM), an important self-supervised learning (SSL) method, has drawn attention for its effectiveness in learning data representations from unlabeled data. Numerous studies underscore the advantages of MIM, highlighting how models pretrained on extensive datasets can enhance the performance of downstream tasks. However, the high computational demands of pretraining pose significant challenges, particularly within academic environments, thereby impeding progress in SSL research. In this study, we propose efficient training recipes for MIM-based SSL that focus on mitigating data-loading bottlenecks and employing progressive training techniques, among other tricks, to closely maintain pretraining performance. Our library enables the training of a MAE-Base/16 model on the ImageNet-1K dataset for 800 epochs within just 18 hours, using a single machine equipped with 8 A100 GPUs. By achieving speed gains of up to 5.8 times, this work not only demonstrates the feasibility of high-efficiency SSL training but also paves the way for broader accessibility and promotes advancement in SSL research, particularly for prototyping and initial testing of SSL ideas. Code is available at https://github.com/erow/EfficientSSL.
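The abstract mentions "progressive training techniques" as one ingredient of the speedup. One common form of progressive training is ramping the input resolution up over the course of pretraining, so early epochs are cheap; the sketch below illustrates that idea under the assumption of a linear resolution ramp (the actual schedule used by DailyMAE is not specified here, and `progressive_resolution` is a hypothetical helper, not part of the EfficientSSL API).

```python
def progressive_resolution(epoch, total_epochs,
                           start_res=128, end_res=224, patch=16):
    """Illustrative linear resolution schedule for progressive training.

    Returns the input resolution for a given epoch, rounded to a multiple
    of the ViT patch size (16 for MAE-Base/16) so the patch grid stays valid.
    The start/end resolutions here are assumptions, not the paper's values.
    """
    frac = min(epoch / max(total_epochs - 1, 1), 1.0)
    res = start_res + frac * (end_res - start_res)
    return int(round(res / patch)) * patch


if __name__ == "__main__":
    # Early epochs train at low resolution; the final epochs reach full size.
    for e in (0, 400, 799):
        print(e, progressive_resolution(e, 800))
```

Because attention cost in a ViT grows quadratically with the number of patches, training at 128px (8x8 = 64 patches) instead of 224px (14x14 = 196 patches) makes early epochs substantially cheaper, which is the intuition behind this class of technique.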

Cite

Text

Wu et al. "DailyMAE: Towards Pretraining Masked Autoencoders in One Day." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-91979-4_12

Markdown

[Wu et al. "DailyMAE: Towards Pretraining Masked Autoencoders in One Day." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/wu2024eccvw-dailymae/) doi:10.1007/978-3-031-91979-4_12

BibTeX

@inproceedings{wu2024eccvw-dailymae,
  title     = {{DailyMAE: Towards Pretraining Masked Autoencoders in One Day}},
  author    = {Wu, Jiantao and Mo, Shentong and Atito, Sara and Feng, Zhenhua and Kittler, Josef and Awais, Muhammad},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2024},
  pages     = {131--149},
  doi       = {10.1007/978-3-031-91979-4_12},
  url       = {https://mlanthology.org/eccvw/2024/wu2024eccvw-dailymae/}
}