Real-Time Breast Lesion Detection in Videos via Spatial-Temporal Feature Aggregation
Abstract
Recently, transformer-based detectors have shown impressive performance for breast lesion detection in ultrasound videos. However, these methods often require substantial computational resources and exhibit low inference speed, which hinders real-time applicability. To address this issue, we introduce a fast yet accurate spatial-temporal transformer, named FA-DETR, to efficiently aggregate multi-scale spatial-temporal features for breast lesion detection in ultrasound videos. Our FA-DETR is based on a lightweight spatial-temporal self-attention module, which seamlessly fuses spatial and temporal features extracted from each video frame. In the decoding phase, we employ IoU-aware query selection to generate independent queries for each frame. These queries gain access to rich spatial-temporal information through cross-attention to the encoder embeddings and frame-aware cross-attention mechanisms. Experiments conducted on a public breast lesion ultrasound video dataset demonstrate that our FA-DETR achieves state-of-the-art performance, with an absolute gain of 3.8% in overall AP while being 2.5 times faster than the best existing approach in the literature. Our code and models will be publicly released.
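The abstract does not say how the per-frame query selection is implemented. As a minimal sketch only, one plausible reading is a top-k selection: for each frame, keep the k encoder tokens with the highest confidence score and use them to initialize that frame's queries. The function name `select_queries_per_frame`, the score layout, and the value of `k` below are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def select_queries_per_frame(
    frame_scores: Sequence[Sequence[float]], k: int
) -> List[List[int]]:
    """Return, for each frame, the indices of its k highest-scoring tokens.

    frame_scores[t][i] is a hypothetical confidence score for encoder
    token i of frame t (an assumption made for this sketch).
    """
    selected = []
    for scores in frame_scores:
        # Sort token indices by descending score; sorted() is stable,
        # so tied scores keep the lower index first.
        order = sorted(range(len(scores)), key=lambda i: -scores[i])
        selected.append(order[:k])
    return selected


# Toy scores for two frames with four encoder tokens each.
scores = [
    [0.10, 0.80, 0.30, 0.65],  # frame 0
    [0.55, 0.20, 0.90, 0.05],  # frame 1
]
queries = select_queries_per_frame(scores, k=2)
print(queries)  # [[1, 3], [2, 0]]
```

Each frame is selected independently here, which matches the abstract's description of independent queries per frame. How the scores are made IoU-aware during training is outside the scope of this sketch.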
Cite
Text
Qin et al. "Real-Time Breast Lesion Detection in Videos via Spatial-Temporal Feature Aggregation." Medical Imaging with Deep Learning, 2025.
Markdown
[Qin et al. "Real-Time Breast Lesion Detection in Videos via Spatial-Temporal Feature Aggregation." Medical Imaging with Deep Learning, 2025.](https://mlanthology.org/midl/2025/qin2025midl-realtime/)
BibTeX
@inproceedings{qin2025midl-realtime,
title = {{Real-Time Breast Lesion Detection in Videos via Spatial-Temporal Feature Aggregation}},
author = {Qin, Chao and Cao, Jiale and Khan, Fahad Shahbaz and Khan, Salman and Fu, Huazhu and Ahissar, Ehud and Anwer, Rao Muhammad},
booktitle = {Medical Imaging with Deep Learning},
year = {2025},
url = {https://mlanthology.org/midl/2025/qin2025midl-realtime/}
}