Lightweight Transformer for Multi-Modal Object Detection (Student Abstract)
Abstract
It has become common practice for perceptual systems to integrate information from multiple sensors to improve object detection accuracy. For example, autonomous vehicles combine visible-light and infrared (IR) information so that they can cope with complex weather conditions. However, higher accuracy usually comes at the cost of greater computational complexity and memory consumption. In this study, we evaluate the performance and complexity of different fusion operators in multi-modal object detection tasks. In addition, we propose a PoolFormer-based fusion operator (PoolFuser) that improves detection accuracy without compromising the efficiency of the detection framework.
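The abstract does not specify how PoolFuser is built, so the following is only a minimal sketch of the general idea behind a pooling-based fusion operator, not the authors' implementation. It assumes single-channel feature maps stored as nested lists, element-wise summation of the two modalities, and a PoolFormer-style token mixer (stride-1 average pooling minus the input, followed by a residual connection). The names `avg_pool2d`, `pool_mixer`, and `pool_fuse` are hypothetical.

```python
def avg_pool2d(x, k=3):
    """Stride-1 average pooling that keeps the input size.

    Border windows average only the in-bounds cells.
    """
    h, w = len(x), len(x[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [
                x[a][b]
                for a in range(max(0, i - r), min(h, i + r + 1))
                for b in range(max(0, j - r), min(w, j + r + 1))
            ]
            out[i][j] = sum(window) / len(window)
    return out


def pool_mixer(x, k=3):
    """PoolFormer-style token mixer: pooled map minus the input.

    The operator has no learnable parameters, which is what keeps it cheap.
    """
    pooled = avg_pool2d(x, k)
    return [[p - v for p, v in zip(prow, xrow)] for prow, xrow in zip(pooled, x)]


def pool_fuse(rgb, ir, k=3):
    """Hypothetical fusion step (an assumption, not the paper's design).

    1. Sum the visible-light and IR feature maps element-wise.
    2. Apply the pooling mixer and add it back as a residual.
    Without a normalization layer the result reduces to the pooled sum.
    """
    summed = [[a + b for a, b in zip(rrow, irow)] for rrow, irow in zip(rgb, ir)]
    mixed = pool_mixer(summed, k)
    return [[s + m for s, m in zip(srow, mrow)] for srow, mrow in zip(summed, mixed)]


if __name__ == "__main__":
    rgb = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
    ir = [[0.0] * 3 for _ in range(3)]
    for row in pool_fuse(rgb, ir):
        print(row)
```

In a real detector the same operation would run per channel on learned feature maps, typically with normalization before the mixer; those details are omitted here because the abstract does not describe them.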
Cite
Text
Cao et al. "Lightweight Transformer for Multi-Modal Object Detection (Student Abstract)." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I13.26946
Markdown
[Cao et al. "Lightweight Transformer for Multi-Modal Object Detection (Student Abstract)." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/cao2023aaai-lightweight/) doi:10.1609/AAAI.V37I13.26946
BibTeX
@inproceedings{cao2023aaai-lightweight,
title = {{Lightweight Transformer for Multi-Modal Object Detection (Student Abstract)}},
author = {Cao, Yue and Fan, Yanshuo and Bin, Junchi and Liu, Zheng},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2023},
pages = {16172-16173},
doi = {10.1609/AAAI.V37I13.26946},
url = {https://mlanthology.org/aaai/2023/cao2023aaai-lightweight/}
}