Bridging Protein Sequences and Microscopy Images with Unified Diffusion Models
Abstract
Fluorescence microscopy is ubiquitously used in cell biology research to characterize the cellular role of a protein. To help elucidate the relationship between the amino acid sequence of a protein and its cellular function, we introduce CELL-Diff, a unified diffusion model facilitating bidirectional transformations between protein sequences and their corresponding microscopy images. Utilizing reference cell morphology images and a protein sequence, CELL-Diff efficiently generates corresponding protein images. Conversely, given a protein image, the model outputs protein sequences. CELL-Diff integrates continuous and discrete diffusion models within a unified framework and is implemented using a transformer-based network. We train CELL-Diff on the Human Protein Atlas (HPA) dataset and fine-tune it on the OpenCell dataset. Experimental results demonstrate that CELL-Diff outperforms existing methods in generating high-fidelity protein images, making it a practical tool for investigating subcellular protein localization and interactions.
Cite
Text
Zheng and Huang. "Bridging Protein Sequences and Microscopy Images with Unified Diffusion Models." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Zheng and Huang. "Bridging Protein Sequences and Microscopy Images with Unified Diffusion Models." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/zheng2025icml-bridging/)
BibTeX
@inproceedings{zheng2025icml-bridging,
title = {{Bridging Protein Sequences and Microscopy Images with Unified Diffusion Models}},
author = {Zheng, Dihan and Huang, Bo},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {78139-78155},
volume = {267},
url = {https://mlanthology.org/icml/2025/zheng2025icml-bridging/}
}