Inverse Abstraction of Neural Networks Using Symbolic Interpolation

Abstract

Neural networks in real-world applications have to satisfy critical properties such as safety and reliability. The analysis of such properties typically requires extracting information through computing pre-images of the network transformations, but it is well-known that explicit computation of pre-images is intractable. We introduce new methods for computing compact symbolic abstractions of pre-images via overapproximations and underapproximations computed through all layers. The abstraction of pre-images enables formal analysis and knowledge extraction without affecting standard learning algorithms. We use inverse abstractions to automatically extract simple control laws and compact representations for pre-images corresponding to unsafe outputs. We illustrate that the extracted abstractions are interpretable and can be used for analyzing complex properties.
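
To make the idea of pre-image abstraction concrete, the following is a minimal sketch, not the paper's symbolic-interpolation procedure: it encodes a tiny one-neuron ReLU network exactly in Z3 and checks whether a candidate input box underapproximates the pre-image of an "unsafe" output set. The toy weights, the unsafe threshold, and the candidate box are illustrative assumptions only.

```python
# Minimal sketch (assumed toy example, not the paper's algorithm):
# encode y = relu(w.x + b) exactly and check whether a candidate box
# is contained in the pre-image of {y >= 1.0}.
from z3 import Reals, And, If, Solver, unsat

x1, x2 = Reals("x1 x2")

# Hypothetical one-neuron network.
w = [1.0, -2.0]
b = 0.5
pre_act = w[0] * x1 + w[1] * x2 + b
y = If(pre_act > 0, pre_act, 0)   # exact ReLU encoding

# Candidate input box (a guess at an underapproximation of the pre-image).
box = And(x1 >= 1.0, x1 <= 2.0, x2 >= -1.0, x2 <= 0.0)

# The box underapproximates the pre-image of {y >= 1.0} iff
# "box AND y < 1.0" is unsatisfiable.
s = Solver()
s.add(box, y < 1.0)
if s.check() == unsat:
    print("box is contained in the pre-image of the unsafe set")
else:
    print("counterexample:", s.model())
```

A box underapproximates the pre-image exactly when the box together with the negated output condition is unsatisfiable; the paper's contribution is computing such under- and overapproximations compactly, layer by layer, using symbolic interpolation rather than a single monolithic query.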

Cite

Text

Dathathri et al. "Inverse Abstraction of Neural Networks Using Symbolic Interpolation." AAAI Conference on Artificial Intelligence, 2019. doi:10.1609/AAAI.V33I01.33013437

Markdown

[Dathathri et al. "Inverse Abstraction of Neural Networks Using Symbolic Interpolation." AAAI Conference on Artificial Intelligence, 2019.](https://mlanthology.org/aaai/2019/dathathri2019aaai-inverse/) doi:10.1609/AAAI.V33I01.33013437

BibTeX

@inproceedings{dathathri2019aaai-inverse,
  title     = {{Inverse Abstraction of Neural Networks Using Symbolic Interpolation}},
  author    = {Dathathri, Sumanth and Gao, Sicun and Murray, Richard M.},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2019},
  pages     = {3437--3444},
  doi       = {10.1609/AAAI.V33I01.33013437},
  url       = {https://mlanthology.org/aaai/2019/dathathri2019aaai-inverse/}
}