Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

Abstract

High-end mobile platforms are rapidly becoming the primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources of these devices still pose significant challenges for real-time DNN inference. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques that accelerate DNN execution on mobile devices. This demo shows that these optimizations can enable real-time mobile execution of multiple DNN applications, including style transfer, DNN coloring, and super resolution.
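
To make the "hardware-friendly structured pruning" idea concrete, the sketch below shows one common form of it: removing whole convolution filters ranked by L1 norm, so the pruned layer stays a dense tensor that mobile hardware can execute efficiently (unlike unstructured weight sparsity). This is a minimal PyTorch illustration under our own assumptions; the helper name prune_conv_channels and the keep_ratio parameter are hypothetical, and it is a generic filter-pruning sketch rather than the authors' pattern-based pruning or compiler pipeline.

import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Keep the output filters with the largest L1 norms.

    Structured (filter-level) pruning removes entire filters, so the
    result is a smaller dense Conv2d rather than a sparse weight tensor.
    """
    # L1 norm of each filter; shape: (out_channels,)
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    keep_idx = torch.argsort(norms, descending=True)[:n_keep]

    # Build a smaller dense layer with the surviving filters.
    pruned = nn.Conv2d(
        conv.in_channels, n_keep, conv.kernel_size,
        stride=conv.stride, padding=conv.padding,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep_idx])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep_idx])
    return pruned

# Example: halve the filters of a 3x3 convolution.
layer = nn.Conv2d(64, 128, kernel_size=3, padding=1)
print(prune_conv_channels(layer, keep_ratio=0.5))  # Conv2d(64, 64, ...)

In practice, downstream layers must be adjusted to the reduced channel count and the network fine-tuned to recover accuracy; the paper's compiler optimizations then exploit the resulting regular structure at inference time.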

Cite

Text

Niu et al. "Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization." International Joint Conference on Artificial Intelligence, 2020. doi:10.24963/IJCAI.2020/778

Markdown

[Niu et al. "Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization." International Joint Conference on Artificial Intelligence, 2020.](https://mlanthology.org/ijcai/2020/niu2020ijcai-real/) doi:10.24963/IJCAI.2020/778

BibTeX

@inproceedings{niu2020ijcai-real,
  title     = {{Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization}},
  author    = {Niu, Wei and Zhao, Pu and Zhan, Zheng and Lin, Xue and Wang, Yanzhi and Ren, Bin},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2020},
  pages     = {5306--5308},
  doi       = {10.24963/IJCAI.2020/778},
  url       = {https://mlanthology.org/ijcai/2020/niu2020ijcai-real/}
}