Content-Aware Input Scaling and Deep Learning Computation Offloading for Low-Latency Embedded Vision
Abstract
Deploying deep learning (DL) models for visual recognition on embedded systems is often constrained by limited compute power and storage capacity and by stringent latency and power requirements. As emerging DL applications continue to evolve, they place increasing demands on computational resources that embedded vision systems are unable to provision. One promising solution to overcome these limitations is computation offloading. However, for performance improvements to be realized, it is essential to carefully partition tasks, taking into account both the quality of the data and the communication overhead. In this paper, we introduce a novel framework for content-aware offloading of DL computations, aimed at maximizing quality-of-service while adhering to latency constraints. In our proposed framework, the embedded vision system (edge device) compresses data in a content-aware manner using a lightweight model and transmits it to a more powerful server. The framework consists of two key components: offline training for efficient content-aware data scaling, and online control that adapts to network variations in real time. To illustrate the effectiveness of our approach, we apply it to multiple downstream tasks such as face identification, person keypoint detection, and instance segmentation, showcasing a significant enhancement in the overall quality of results for various applications.
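The abstract does not specify how the online control works. As a rough illustration of the general idea only, the sketch below picks, from a small set of candidate input scales, the one with the highest estimated quality whose estimated end-to-end latency still fits a deadline under the current bandwidth. This is not the authors' method: `ScaleOption`, `pick_scale`, the latency model (edge time plus transmission time plus server time), and every number in `OPTIONS` are hypothetical assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ScaleOption:
    # All fields are hypothetical placeholders, not values from the paper.
    scale: float        # resize factor applied to the input frame
    payload_bytes: int  # assumed compressed size at this scale
    quality: float      # assumed task quality at this scale (profiled offline)


def pick_scale(options, bandwidth_bps, server_ms, edge_ms, deadline_ms):
    """Return the highest-quality option that fits the deadline, or None.

    Assumed latency model: edge preprocessing + transmission + server inference.
    """
    best = None
    for opt in options:
        transmit_ms = opt.payload_bytes * 8 / bandwidth_bps * 1000.0
        total_ms = edge_ms + transmit_ms + server_ms
        if total_ms <= deadline_ms and (best is None or opt.quality > best.quality):
            best = opt
    return best


# Hypothetical candidates: smaller scales send fewer bytes but lose quality.
OPTIONS = [
    ScaleOption(scale=1.0, payload_bytes=200_000, quality=0.90),
    ScaleOption(scale=0.5, payload_bytes=60_000, quality=0.80),
    ScaleOption(scale=0.25, payload_bytes=20_000, quality=0.60),
]

if __name__ == "__main__":
    # Re-run the selection whenever the bandwidth estimate changes.
    for bw in (40e6, 20e6, 1e6):
        choice = pick_scale(OPTIONS, bw, server_ms=20.0, edge_ms=5.0, deadline_ms=100.0)
        print(bw, choice.scale if choice else None)
```

In this toy setup, a drop in bandwidth pushes the choice toward a smaller scale, and the function returns `None` when no candidate meets the deadline, at which point a real system would need some fallback such as local inference.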
Cite
Text
Prabhune et al. "Content-Aware Input Scaling and Deep Learning Computation Offloading for Low-Latency Embedded Vision." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00227
Markdown
[Prabhune et al. "Content-Aware Input Scaling and Deep Learning Computation Offloading for Low-Latency Embedded Vision." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/prabhune2024cvprw-contentaware/) doi:10.1109/CVPRW63382.2024.00227
BibTeX
@inproceedings{prabhune2024cvprw-contentaware,
title = {{Content-Aware Input Scaling and Deep Learning Computation Offloading for Low-Latency Embedded Vision}},
author = {Prabhune, Omkar and Chen, Tianen and Kim, Younghyun},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2024},
pages = {2218-2226},
doi = {10.1109/CVPRW63382.2024.00227},
url = {https://mlanthology.org/cvprw/2024/prabhune2024cvprw-contentaware/}
}