A Compression-Compilation Co-Design Framework Towards Real-Time Object Detection on Mobile Devices
Abstract
The rapid development and wide adoption of object detection techniques have raised demands on both the accuracy and the speed of object detectors. In this work, we propose a compression-compilation co-design framework to achieve real-time YOLOv4 inference on mobile devices. We propose a novel fine-grained structured pruning scheme, which maintains high accuracy while achieving high hardware parallelism. Our pruned YOLOv4 achieves 48.9 mAP and an inference speed of 17 FPS on an off-the-shelf Samsung Galaxy S20 smartphone, which is 5.5x faster than the original state-of-the-art detector YOLOv4.
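The abstract does not spell out the pruning scheme, so the following is only a minimal sketch of one way a fine-grained structured pruning pass could look, not the authors' method. It assumes a 2-D weight matrix split into blocks of consecutive rows, with the same columns zeroed for every row in a block so the surviving weights stay regular enough for parallel hardware. The function name `block_column_prune` and the parameters `block_rows` and `keep_ratio` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def block_column_prune(weights: Matrix, block_rows: int, keep_ratio: float) -> Matrix:
    """Zero the lowest-magnitude columns inside each block of consecutive rows.

    Hypothetical illustration: columns are ranked per block by their squared
    L2 norm over that block's rows, and only the top `keep_ratio` fraction of
    columns is kept in that block. The input matrix is not modified.
    """
    if block_rows < 1:
        raise ValueError("block_rows must be at least 1")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    n_rows = len(weights)
    n_cols = len(weights[0]) if weights else 0
    n_keep = max(1, round(n_cols * keep_ratio))
    pruned = [row[:] for row in weights]  # work on a copy

    for start in range(0, n_rows, block_rows):
        block = range(start, min(start + block_rows, n_rows))
        # Importance of each column, measured only within this block.
        norms = [sum(weights[r][c] ** 2 for r in block) for c in range(n_cols)]
        ranked = sorted(range(n_cols), key=lambda c: norms[c], reverse=True)
        keep = set(ranked[:n_keep])
        # Every row in the block shares the same sparsity pattern.
        for r in block:
            for c in range(n_cols):
                if c not in keep:
                    pruned[r][c] = 0.0
    return pruned


example = [
    [0.9, 0.1, -0.8, 0.05],
    [0.7, -0.2, 0.6, 0.0],
    [0.01, 1.0, 0.02, -0.9],
    [0.0, 0.8, -0.03, 0.7],
]
result = block_column_prune(example, block_rows=2, keep_ratio=0.5)
```

In this toy input the first two rows keep columns 0 and 2 while the last two rows keep columns 1 and 3, which illustrates the trade-off the abstract alludes to: the pattern is finer-grained than pruning whole columns of the matrix, yet each block remains regular.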
Cite
Text
Cai et al. "A Compression-Compilation Co-Design Framework Towards Real-Time Object Detection on Mobile Devices." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I18.17992
Markdown
[Cai et al. "A Compression-Compilation Co-Design Framework Towards Real-Time Object Detection on Mobile Devices." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/cai2021aaai-compression/) doi:10.1609/AAAI.V35I18.17992
BibTeX
@inproceedings{cai2021aaai-compression,
title = {{A Compression-Compilation Co-Design Framework Towards Real-Time Object Detection on Mobile Devices}},
author = {Cai, Yuxuan and Yuan, Geng and Li, Hongjia and Niu, Wei and Li, Yanyu and Tang, Xulong and Ren, Bin and Wang, Yanzhi},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {15997--16000},
doi = {10.1609/AAAI.V35I18.17992},
url = {https://mlanthology.org/aaai/2021/cai2021aaai-compression/}
}