ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy

Abstract

Deriving insights from experimentally generated datasets requires methods that can account for random and systematic measurement errors and remove them in order to accurately represent the underlying effects of the conditions being tested. Here we present a framework for pretraining on large-scale microscopy datasets that includes three steps: (1) curating a set of diverse and self-consistent training samples, (2) scaling training of an appropriate foundation model architecture on this dataset, and (3) evaluating intermediate layers of the trained model to identify the best representation for downstream tasks. Using this strategy, we present what is, to our knowledge, the largest foundation model for cell microscopy data: a new 1.9 billion-parameter ViT-G/8 MAE trained on over 8 billion microscopy image crops. Compared to a previously published ViT-L/8 MAE, our new model achieves a 60% improvement in linear separability of genetic perturbations and obtains the best overall performance on whole-genome relationship recall, batch correction replicate consistency, and compound-gene activity prediction benchmarks.
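Step (3) of the framework, choosing which layer's output to use as the representation, can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes that per-layer embeddings have already been extracted, and it uses a nearest-centroid classifier (a simple linear classifier) as a stand-in for a linear probe. The layer names, the synthetic features, and the helper functions `make_synthetic_layers`, `centroid_probe_accuracy`, and `select_best_layer` are all hypothetical.

```python
import random
from typing import Dict, List, Tuple

Vector = List[float]


def make_synthetic_layers(
    seed: int = 0, n: int = 200, dim: int = 4
) -> Tuple[Dict[str, List[Vector]], List[int]]:
    """Build toy per-layer features for a two-class problem (assumed data)."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    # (signal strength, noise std) per hypothetical layer.
    settings = {"block_1": (0.0, 1.0), "block_2": (1.0, 0.3), "block_3": (1.0, 3.0)}
    layers: Dict[str, List[Vector]] = {}
    for name, (signal, noise) in settings.items():
        feats = []
        for y in labels:
            mean = signal if y == 1 else -signal
            feats.append([rng.gauss(mean, noise) for _ in range(dim)])
        layers[name] = feats
    return layers, labels


def centroid_probe_accuracy(
    train: List[Tuple[Vector, int]], test: List[Tuple[Vector, int]]
) -> float:
    """Fit one centroid per class on train, then report accuracy on test."""
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = {}
    for x, y in train:
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    correct = 0
    for x, y in test:
        # Predict the class whose centroid is closest in squared distance.
        pred = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )
        correct += int(pred == y)
    return correct / len(test)


def select_best_layer(
    features_by_layer: Dict[str, List[Vector]],
    labels: List[int],
    train_fraction: float = 0.5,
) -> Tuple[str, Dict[str, float]]:
    """Score every layer with the same probe and return the best one."""
    n_train = int(len(labels) * train_fraction)
    scores: Dict[str, float] = {}
    for layer, feats in features_by_layer.items():
        pairs = list(zip(feats, labels))
        scores[layer] = centroid_probe_accuracy(pairs[:n_train], pairs[n_train:])
    best = max(scores, key=lambda name: scores[name])
    return best, scores


layers, labels = make_synthetic_layers()
best_layer, layer_scores = select_best_layer(layers, labels)
print(best_layer, layer_scores)
```

The point of the sketch is only the selection loop: every candidate layer is scored with the same held-out probe, and the highest-scoring layer is kept for downstream tasks. In practice the probe, the train/test split, and the benchmark would follow whatever evaluation protocol is being used.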

Cite

Text

Kenyon-Dean et al. "ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Kenyon-Dean et al. "ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/kenyondean2025icml-vitally/)

BibTeX

@inproceedings{kenyondean2025icml-vitally,
  title     = {{ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy}},
  author    = {Kenyon-Dean, Kian and Wang, Zitong Jerry and Urbanik, John and Donhauser, Konstantin and Hartford, Jason and Saberian, Saber and Sahin, Nil and Bendidi, Ihab and Celik, Safiye and Vera, Juan Sebastián Rodríguez and Fay, Marta and Haque, Imran S and Kraus, Oren},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {29735--29752},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/kenyondean2025icml-vitally/}
}