Response Time Analysis for Explainability of Visual Processing in CNNs

Abstract

Explainable artificial intelligence (XAI) methods rely on access to model architecture and parameters, which is not always feasible for users, practitioners, and regulators. Inspired by cognitive psychology, we present a case for response times (RTs) as a technique for XAI. RTs are observable without access to the model. Moreover, dynamic inference models that perform conditional computation generate variable RTs for visual learning tasks, depending on hierarchical representations. We show that MSDNet, a conditional computation model with an early-exit architecture, exhibits slower RTs for images with more complex features in the ObjectNet test set. It also exhibits the human phenomenon of scene grammar, in which object recognition depends on intrascene object-object relationships. These results cast light on MSDNet’s feature space without opening the black box and illustrate the promise of RT methods for XAI.
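
As a rough illustration of how an early-exit model can yield variable RTs, the sketch below assumes a confidence-threshold exit rule and uses the index of the exit taken as an RT proxy. The function names, the threshold of 0.9, and the logit values are all hypothetical; this is not the authors' code, nor MSDNet's actual interface, and the abstract does not state how RT was measured.

```python
import math


def softmax(logits):
    """Convert a list of logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(per_exit_logits, threshold=0.9):
    """Return (predicted_class, exit_index) for a single input.

    per_exit_logits: one logit vector per classifier exit, ordered from
    shallowest to deepest (hypothetical stand-in for a real network).
    exit_index is 1-based and serves as the RT proxy: the deeper the
    exit taken, the more computation was spent on the input.
    """
    last = len(per_exit_logits)
    for i, logits in enumerate(per_exit_logits, start=1):
        probs = softmax(logits)
        confidence = max(probs)
        # Stop at the first sufficiently confident exit, or at the final one.
        if confidence >= threshold or i == last:
            return probs.index(confidence), i


# Made-up logits: the "easy" input is confident at the first exit,
# the "hard" input only becomes confident at the deepest exit.
easy_image = [[6.0, 0.5, 0.2], [7.0, 0.1, 0.1], [8.0, 0.0, 0.0]]
hard_image = [[1.0, 0.9, 0.8], [1.5, 1.0, 0.7], [4.5, 0.5, 0.2]]

easy_label, easy_rt = early_exit_predict(easy_image)
hard_label, hard_rt = early_exit_predict(hard_image)
print("easy:", easy_label, "exit", easy_rt)
print("hard:", hard_label, "exit", hard_rt)
```

Under these assumptions, the exit index can be recorded from the outside as a response time, without inspecting weights or intermediate activations, which is the property the abstract relies on.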

Cite

Text

Taylor et al. "Response Time Analysis for Explainability of Visual Processing in CNNs." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020. doi:10.1109/CVPRW50498.2020.00199

Markdown

[Taylor et al. "Response Time Analysis for Explainability of Visual Processing in CNNs." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020.](https://mlanthology.org/cvprw/2020/taylor2020cvprw-response/) doi:10.1109/CVPRW50498.2020.00199

BibTeX

@inproceedings{taylor2020cvprw-response,
  title     = {{Response Time Analysis for Explainability of Visual Processing in CNNs}},
  author    = {Taylor, J. Eric and Shekhar, Shashank and Taylor, Graham W.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2020},
  pages     = {1555--1558},
  doi       = {10.1109/CVPRW50498.2020.00199},
  url       = {https://mlanthology.org/cvprw/2020/taylor2020cvprw-response/}
}