Zero Time Waste: Recycling Predictions in Early Exit Neural Networks

Abstract

The problem of reducing processing time of large deep learning models is a fundamental challenge in many real-world applications. Early exit methods strive towards this goal by attaching additional Internal Classifiers (ICs) to intermediate layers of a neural network. ICs can quickly return predictions for easy examples and, as a result, reduce the average inference time of the whole model. However, if a particular IC does not decide to return an answer early, its predictions are discarded, with its computations effectively being wasted. To solve this issue, we introduce Zero Time Waste (ZTW), a novel approach in which each IC reuses predictions returned by its predecessors by (1) adding direct connections between ICs and (2) combining previous outputs in an ensemble-like manner. We conduct extensive experiments across various datasets and architectures to demonstrate that ZTW achieves a significantly better accuracy vs. inference time trade-off than other recently proposed early exit methods.
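
The abstract outlines the two mechanisms behind ZTW: direct (cascading) connections between consecutive ICs, and an ensemble over all predictions gathered so far that decides whether to exit early. Below is a minimal PyTorch sketch of that control flow; it is not the authors' implementation, and the stage split, the IC head design, the uniform geometric-mean ensembling, and the 0.9 confidence threshold are illustrative assumptions that differ in detail from the paper.

```python
# Minimal sketch of cascading ICs with ensemble-based early exit (illustrative,
# not the authors' code; head design, weighting, and threshold are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F


class CascadingIC(nn.Module):
    """Internal classifier that recycles the previous IC's output via a direct connection."""

    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(in_channels, num_classes)
        # Direct IC-to-IC connection: fuse this head's logits with the previous IC's logits.
        self.combine = nn.Linear(2 * num_classes, num_classes)

    def forward(self, features, prev_logits=None):
        own = self.head(self.pool(features).flatten(1))
        if prev_logits is None:
            return own
        return self.combine(torch.cat([own, prev_logits], dim=1))


class EarlyExitNet(nn.Module):
    """Backbone split into stages, with a cascading IC attached after each stage."""

    def __init__(self, stages, ic_channels, num_classes, threshold=0.9):
        super().__init__()
        self.stages = nn.ModuleList(stages)
        self.ics = nn.ModuleList([CascadingIC(c, num_classes) for c in ic_channels])
        self.threshold = threshold

    def forward(self, x):
        prev_logits, log_prob_sum = None, 0.0
        for k, (stage, ic) in enumerate(zip(self.stages, self.ics), start=1):
            x = stage(x)
            prev_logits = ic(x, prev_logits)  # reuse the earlier prediction instead of discarding it
            log_prob_sum = log_prob_sum + F.log_softmax(prev_logits, dim=1)
            # Normalized geometric mean of the class probabilities produced so far.
            ensemble = F.softmax(log_prob_sum / k, dim=1)
            # Early exit (per example in practice; batch-level here for brevity).
            if ensemble.max(dim=1).values.min() > self.threshold:
                return ensemble
        return ensemble


# Toy usage: two convolutional stages on 32x32 RGB inputs, 10 classes.
stages = [
    nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),
    nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU()),
]
model = EarlyExitNet(stages, ic_channels=[16, 32], num_classes=10)
probs = model(torch.randn(4, 3, 32, 32))  # shape: (4, 10)
```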

Cite

Text

Wołczyk et al. "Zero Time Waste: Recycling Predictions in Early Exit Neural Networks." Neural Information Processing Systems, 2021.

Markdown

[Wołczyk et al. "Zero Time Waste: Recycling Predictions in Early Exit Neural Networks." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/woczyk2021neurips-zero/)

BibTeX

@inproceedings{woczyk2021neurips-zero,
  title     = {{Zero Time Waste: Recycling Predictions in Early Exit Neural Networks}},
  author    = {Wołczyk, Maciej and Wójcik, Bartosz and Bałazy, Klaudia and Podolak, Igor T. and Tabor, Jacek and Śmieja, Marek and Trzcinski, Tomasz},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/woczyk2021neurips-zero/}
}