Multi-Resolution Rescored ByteTrack for Video Object Detection on Ultra-Low-Power Embedded Systems

Abstract

This paper introduces Multi-Resolution Rescored ByteTrack (MR2-ByteTrack), a novel video object detection framework for ultra-low-power embedded processors. The method reduces the average compute load of an off-the-shelf Deep Neural Network (DNN) based object detector by up to 2.25× by alternating the processing of high-resolution images (320 × 320 pixels) with multiple down-sized frames (192 × 192 pixels). To tackle the accuracy degradation caused by the reduced input size, MR2-ByteTrack correlates the output detections over time using the ByteTrack tracker and corrects potential misclassifications using a novel probabilistic Rescore algorithm. By interleaving two down-sized images for every high-resolution one as the input to different state-of-the-art DNN object detectors with MR2-ByteTrack, we demonstrate an average accuracy increase of 2.16% and a latency reduction of 43% on the GAP9 microcontroller compared to a baseline frame-by-frame inference scheme that uses exclusively full-resolution images. Code is available at: https://github.com/Bomps4/Multi_Resolution_Rescored_ByteTrack
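
The interleaving scheme described in the abstract can be illustrated with a small back-of-the-envelope calculation. The sketch below is not taken from the authors' repository: the function names (`relative_cost`, `average_cost`, `schedule`) are hypothetical, and it assumes that the detector's per-frame cost scales linearly with the number of input pixels, which is a simplification rather than a measured property of any particular detector or of GAP9.

```python
from fractions import Fraction


def relative_cost(side: int, full_side: int) -> Fraction:
    """Cost of one square frame relative to a full-resolution frame.

    ASSUMPTION: cost is proportional to pixel count (side * side).
    """
    return Fraction(side * side, full_side * full_side)


def average_cost(full_side: int, small_side: int, small_per_full: int) -> Fraction:
    """Average per-frame cost of a repeating cycle made of one full-resolution
    frame followed by `small_per_full` down-sized frames, relative to a
    baseline that processes every frame at full resolution."""
    cycle_cost = 1 + small_per_full * relative_cost(small_side, full_side)
    return cycle_cost / (small_per_full + 1)


def schedule(full_side: int, small_side: int, small_per_full: int, n_frames: int) -> list[int]:
    """Input side length chosen for each of the first `n_frames` frames."""
    period = small_per_full + 1
    return [full_side if i % period == 0 else small_side for i in range(n_frames)]


# One 320 x 320 frame followed by two 192 x 192 frames, as in the abstract.
avg = average_cost(320, 192, 2)
print(schedule(320, 192, 2, 6))  # side length per frame
print(avg, float(avg))           # average relative cost per frame
print(float(1 - avg))            # fraction of compute saved vs. the baseline
```

Under this pixel-count assumption, the one-to-two schedule costs 43/75 of the full-resolution baseline, a saving of roughly 42.7%, which is close to the 43% latency reduction reported in the abstract. The sketch does not attempt to reproduce the "up to 2.25×" figure, which depends on detector and schedule details not given here.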

Cite

Text

Bompani et al. "Multi-Resolution Rescored ByteTrack for Video Object Detection on Ultra-Low-Power Embedded Systems." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00223

Markdown

[Bompani et al. "Multi-Resolution Rescored ByteTrack for Video Object Detection on Ultra-Low-Power Embedded Systems." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/bompani2024cvprw-multiresolution/) doi:10.1109/CVPRW63382.2024.00223

BibTeX

@inproceedings{bompani2024cvprw-multiresolution,
  title     = {{Multi-Resolution Rescored ByteTrack for Video Object Detection on Ultra-Low-Power Embedded Systems}},
  author    = {Bompani, Luca and Rusci, Manuele and Palossi, Daniele and Conti, Francesco and Benini, Luca},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2024},
  pages     = {2182--2190},
  doi       = {10.1109/CVPRW63382.2024.00223},
  url       = {https://mlanthology.org/cvprw/2024/bompani2024cvprw-multiresolution/}
}