DDViT: Double-Level Fusion Domain Adapter Vision Transformer (Student Abstract)
Abstract
With the help of Vision Transformers (ViTs), medical image segmentation has achieved outstanding performance. In particular, ViTs overcome a limitation of convolutional neural networks (CNNs), which rely on local receptive fields: ViTs use self-attention to consider relationships between all image pixels or patches simultaneously. However, they require large datasets for training and do not capture low-level features well. To that end, we propose DDViT, a novel ViT model that incorporates a CNN to alleviate data hunger in medical image segmentation, using two multi-scale feature representations. Our approach combines a ViT with a plug-in domain adapter (DA) and a Double-Level Fusion (DLF) technique, complemented by a mutual knowledge distillation paradigm that facilitates the exchange of knowledge between a universal network and specialized domain-specific network branches. The DLF framework plays a pivotal role in our encoder-decoder architecture, combining the TransFuse module with a robust CNN-based encoder. Extensive experiments across diverse medical image segmentation datasets demonstrate the efficacy of DDViT compared with alternative CNN-based and Transformer-based approaches.
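The abstract does not specify the form of the mutual knowledge distillation loss. As an illustration only, the following minimal sketch assumes one common formulation: each network treats the other's temperature-softened prediction as a soft target, and the loss for each is a KL divergence. The function names, the temperature value, and the use of a single pixel's class logits are hypothetical choices made for this example, not details taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mutual_distillation_loss(universal_logits, domain_logits, temperature=2.0):
    """Assumed symmetric scheme: each branch learns from the other's soft output.

    Returns (loss for the universal network, loss for the domain-specific branch).
    """
    p_u = softmax(universal_logits, temperature)
    p_d = softmax(domain_logits, temperature)
    loss_universal = kl_divergence(p_d, p_u)  # domain branch acts as teacher
    loss_domain = kl_divergence(p_u, p_d)     # universal network acts as teacher
    return loss_universal, loss_domain

if __name__ == "__main__":
    # Hypothetical class logits for one pixel from each network.
    print(mutual_distillation_loss([2.0, 0.0, -1.0], [0.0, 1.0, 0.5]))
```

In a segmentation setting, a loss of this kind would typically be averaged over all pixels and added to each network's supervised segmentation loss; how DDViT weights or schedules these terms is not stated in the abstract.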
Cite
Text
Sun and Sheng. "DDViT: Double-Level Fusion Domain Adapter Vision Transformer (Student Abstract)." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I21.30516
Markdown
[Sun and Sheng. "DDViT: Double-Level Fusion Domain Adapter Vision Transformer (Student Abstract)." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/sun2024aaai-ddvit/) doi:10.1609/AAAI.V38I21.30516
BibTeX
@inproceedings{sun2024aaai-ddvit,
title = {{DDViT: Double-Level Fusion Domain Adapter Vision Transformer (Student Abstract)}},
author = {Sun, Linpeng and Sheng, Victor S.},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2024},
pages = {23661-23663},
doi = {10.1609/AAAI.V38I21.30516},
url = {https://mlanthology.org/aaai/2024/sun2024aaai-ddvit/}
}