A Systematic Framework for Natural Perturbations from Videos
Abstract
We introduce a systematic framework for quantifying the robustness of classifiers to naturally occurring perturbations of images found in videos. As part of this framework, we construct ImageNet-Vid-Robust, a human-expert-reviewed dataset of 22,668 images grouped into 1,145 sets of perceptually similar images derived from frames in the ImageNet Video Object Detection dataset. We evaluate a diverse array of classifiers trained on ImageNet, including models trained for robustness, and show a median classification accuracy drop of 16%. Additionally, we evaluate the Faster R-CNN and R-FCN models for detection, and show that natural perturbations induce both classification and localization errors, leading to a median drop in detection mAP of 14 points. Our analysis shows that natural perturbations in the real world pose a serious problem for current CNNs and a significant challenge to their deployment in safety-critical environments that require reliable, low-latency predictions.
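The abstract does not spell out how the accuracy drop is computed. As a rough illustration only, the sketch below assumes one plausible definition: a set counts as correct under perturbation only if the classifier is right on the anchor frame and on every perceptually similar frame in that set. The `FrameSet` type, its field names, and the toy predictions are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameSet:
    """One set of perceptually similar frames (hypothetical structure)."""
    label: int                 # ground-truth class for the whole set
    anchor_pred: int           # prediction on the anchor frame
    neighbor_preds: List[int]  # predictions on the similar frames


def anchor_accuracy(sets: List[FrameSet]) -> float:
    # Fraction of sets whose anchor frame is classified correctly.
    return sum(s.anchor_pred == s.label for s in sets) / len(sets)


def perturbed_accuracy(sets: List[FrameSet]) -> float:
    # Assumed worst-case rule: the anchor and all neighbors must be correct.
    return sum(
        s.anchor_pred == s.label and all(p == s.label for p in s.neighbor_preds)
        for s in sets
    ) / len(sets)


def accuracy_drop(sets: List[FrameSet]) -> float:
    # Difference between anchor accuracy and worst-case accuracy.
    return anchor_accuracy(sets) - perturbed_accuracy(sets)


# Toy data, made up for illustration.
example_sets = [
    FrameSet(label=0, anchor_pred=0, neighbor_preds=[0, 0]),  # robust
    FrameSet(label=1, anchor_pred=1, neighbor_preds=[1, 2]),  # breaks on a neighbor
    FrameSet(label=2, anchor_pred=2, neighbor_preds=[2]),     # robust
    FrameSet(label=3, anchor_pred=0, neighbor_preds=[3, 3]),  # anchor already wrong
]

print(anchor_accuracy(example_sets), perturbed_accuracy(example_sets))
print(accuracy_drop(example_sets))
```

Under this assumed rule, a single misclassified neighbor frame is enough to count the whole set as an error, which is why the perturbed accuracy can never exceed the anchor accuracy.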
Cite
Text
Shankar et al. "A Systematic Framework for Natural Perturbations from Videos." ICML 2019 Workshops: Deep_Phenomena, 2019.
Markdown
[Shankar et al. "A Systematic Framework for Natural Perturbations from Videos." ICML 2019 Workshops: Deep_Phenomena, 2019.](https://mlanthology.org/icmlw/2019/shankar2019icmlw-systematic/)
BibTeX
@inproceedings{shankar2019icmlw-systematic,
title = {{A Systematic Framework for Natural Perturbations from Videos}},
author = {Shankar, Vaishaal and Dave, Achal and Roelofs, Rebecca and Ramanan, Deva and Recht, Benjamin and Schmidt, Ludwig},
booktitle = {ICML 2019 Workshops: Deep_Phenomena},
year = {2019},
url = {https://mlanthology.org/icmlw/2019/shankar2019icmlw-systematic/}
}