RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representations from X-Ray Images
Abstract
In this study, we propose a novel method called region-guided masked image modeling (RGMIM) for learning meaningful representations from X-ray images. Our method adopts a new masking strategy that uses organ-mask information to identify valid regions and learn more meaningful representations. We conduct quantitative evaluations on an open lung X-ray image dataset, along with studies of the masking-ratio hyperparameter. When using the entire training set, RGMIM outperformed comparable methods, achieving a lung disease detection accuracy of 0.962. In particular, RGMIM significantly improved performance over other methods when training data was limited, such as with 5% or 10% of the training set. By masking more valid regions, RGMIM facilitates the learning of discriminative representations and, in turn, high-accuracy lung disease detection. In our experiments, RGMIM outperforms other state-of-the-art self-supervised learning methods, particularly when limited training data is available.
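The core idea of the masking strategy described above — restricting masked patches to the valid (organ) region rather than sampling them uniformly over the whole image — can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name, the patch-grid representation of the organ mask, and the default masking ratio are all assumptions for the example.

```python
import numpy as np

def region_guided_mask(organ_mask, mask_ratio=0.6, rng=None):
    """Sample patches to mask only inside the organ region (illustrative sketch).

    organ_mask: (H, W) binary array over the patch grid, 1 = valid (organ) patch.
    mask_ratio: fraction of valid patches to mask.
    Returns a boolean (H, W) array where True marks a masked patch.
    """
    rng = np.random.default_rng(rng)
    # Candidate patch indices: only those covered by the organ mask.
    valid = np.flatnonzero(organ_mask.ravel())
    # Number of patches to mask, as a fraction of the valid region only.
    n_mask = int(round(mask_ratio * valid.size))
    chosen = rng.choice(valid, size=n_mask, replace=False)
    mask = np.zeros(organ_mask.size, dtype=bool)
    mask[chosen] = True
    return mask.reshape(organ_mask.shape)
```

Compared with uniform random masking, every masked patch here falls inside the organ region, so the reconstruction objective is spent on anatomically informative areas rather than background.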
Cite
Text
Li et al. "RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representations from X-Ray Images." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-91578-9_9
Markdown
[Li et al. "RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representations from X-Ray Images." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/li2024eccvw-rgmim/) doi:10.1007/978-3-031-91578-9_9
BibTeX
@inproceedings{li2024eccvw-rgmim,
title = {{RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representations from X-Ray Images}},
author = {Li, Guang and Togo, Ren and Ogawa, Takahiro and Haseyama, Miki},
booktitle = {European Conference on Computer Vision Workshops},
year = {2024},
pages = {148--157},
doi = {10.1007/978-3-031-91578-9_9},
url = {https://mlanthology.org/eccvw/2024/li2024eccvw-rgmim/}
}