Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video

Jain, Samvit; Wang, Xin; Gonzalez, Joseph E.

doi:10.1109/CVPR.2019.00907

CVPR 2019

Abstract

We present Accel, a novel semantic video segmentation system that achieves high accuracy at low inference cost by combining the predictions of two network branches: (1) a reference branch that extracts high-detail features on a reference keyframe, and warps these features forward using frame-to-frame optical flow estimates, and (2) an update branch that computes features of adjustable quality on the current frame, performing a temporal update at each video frame. The modularity of the update branch, where feature subnetworks of varying layer depth can be inserted (e.g. ResNet-18 to ResNet-101), enables operation over a new, state-of-the-art accuracy-throughput trade-off spectrum. Over this curve, Accel models achieve both higher accuracy and faster inference times than the closest comparable single-frame segmentation networks. In general, Accel significantly outperforms previous work on efficient semantic video segmentation, correcting warping-related error that compounds on datasets with complex dynamics. Accel is end-to-end trainable and highly modular: the reference network, the optical flow network, and the update network can each be selected independently, depending on application requirements, and then jointly fine-tuned. The result is a robust, general system for fast, high-accuracy semantic segmentation on video.
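As a rough illustration of the two-branch design described in the abstract, the sketch below warps keyframe score maps to the current frame using a flow field, then fuses them with scores computed on the current frame. It is a minimal NumPy sketch under several assumptions that the abstract does not confirm: the function names (`warp_nearest`, `fuse_scores`, `accel_step`) are hypothetical, warping uses nearest-neighbour sampling, fusion is a 1x1 convolution over the concatenated score maps, and `update_net` stands in for any per-frame network.

```python
import numpy as np


def warp_nearest(features, flow):
    """Warp keyframe features of shape (C, H, W) to the current frame.

    flow[0] and flow[1] hold, for every current-frame pixel, the x and y
    offset of the keyframe location to sample from. Nearest-neighbour
    sampling with border clamping is an assumption made for brevity.
    """
    _, h, w = features.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(np.rint(xs + flow[0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[1]).astype(int), 0, h - 1)
    return features[:, src_y, src_x]


def fuse_scores(ref_scores, upd_scores, weight, bias):
    """Fuse two (K, H, W) score maps with a 1x1 convolution (assumed).

    weight has shape (K, 2K) and bias has shape (K,).
    """
    stacked = np.concatenate([ref_scores, upd_scores], axis=0)  # (2K, H, W)
    return np.einsum("kc,chw->khw", weight, stacked) + bias[:, None, None]


def accel_step(key_scores, flow, cur_frame, update_net, weight, bias):
    """One hypothetical inference step on a non-keyframe.

    Reference branch: warp the keyframe scores to the current frame.
    Update branch: run the (possibly cheaper) network on the current frame.
    Returns a per-pixel label map of shape (H, W).
    """
    warped = warp_nearest(key_scores, flow)
    fresh = update_net(cur_frame)
    fused = fuse_scores(warped, fresh, weight, bias)
    return fused.argmax(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_classes, height, width = 3, 4, 5
    key_scores = rng.normal(size=(num_classes, height, width))
    flow = np.zeros((2, height, width))  # no motion in this toy example
    frame = rng.normal(size=(3, height, width))

    # Stand-in update network: random scores with the right shape.
    def toy_update_net(image):
        return rng.normal(size=(num_classes,) + image.shape[1:])

    # Fusion weights that simply average the two branches.
    weight = 0.5 * np.hstack([np.eye(num_classes), np.eye(num_classes)])
    bias = np.zeros(num_classes)
    labels = accel_step(key_scores, flow, frame, toy_update_net, weight, bias)
    print(labels.shape)
```

In this sketch the accuracy-throughput trade-off from the abstract corresponds to the choice of `update_net`: a deeper network gives the fusion step better current-frame scores at higher per-frame cost.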

Cite

Text

Jain et al. "Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. doi:10.1109/CVPR.2019.00907

Markdown

[Jain et al. "Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.](https://mlanthology.org/cvpr/2019/jain2019cvpr-accel/) doi:10.1109/CVPR.2019.00907

BibTeX

@inproceedings{jain2019cvpr-accel,
  title     = {{Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video}},
  author    = {Jain, Samvit and Wang, Xin and Gonzalez, Joseph E.},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2019},
  doi       = {10.1109/CVPR.2019.00907},
  url       = {https://mlanthology.org/cvpr/2019/jain2019cvpr-accel/}
}