Inverse Abstraction of Neural Networks Using Symbolic Interpolation
Abstract
Neural networks in real-world applications must satisfy critical properties such as safety and reliability. Analyzing such properties typically requires extracting information by computing pre-images of the network's transformations, but explicit computation of pre-images is well known to be intractable. We introduce new methods for computing compact symbolic abstractions of pre-images by computing their overapproximations and underapproximations through all layers. The abstraction of pre-images enables formal analysis and knowledge extraction without affecting standard learning algorithms. We use inverse abstractions to automatically extract simple control laws and compact representations for pre-images corresponding to unsafe outputs. We illustrate that the extracted abstractions are interpretable and can be used for analyzing complex properties.
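To give a flavor of what pre-image computation means here, the sketch below computes the exact pre-image of an output box through a single linear + ReLU layer as a set of half-space constraints. This is only a minimal one-layer illustration, not the paper's interpolant-based abstraction for deep networks; the matrices, bounds, and function names are hypothetical.

```python
import numpy as np

def relu_box_preimage(l, u):
    """Exact pre-image of the output box [l, u] under elementwise ReLU,
    given as bounds on the pre-activation z (-inf marks 'unbounded below')."""
    l = np.asarray(l, float)
    u = np.asarray(u, float)
    # ReLU(z_i) in [l_i, u_i]:
    #   if l_i <= 0: any z_i <= u_i works (ReLU(z_i) >= 0 >= l_i automatically)
    #   if l_i  > 0: need l_i <= z_i <= u_i
    lo = np.where(l > 0, l, -np.inf)
    hi = u.copy()
    return lo, hi

def linear_preimage_halfspaces(W, b, lo, hi):
    """Pre-image of {z : lo <= z <= hi} under z = W x + b, as {x : A x <= c}."""
    rows, rhs = [], []
    for i in range(W.shape[0]):
        if np.isfinite(hi[i]):   # W_i x + b_i <= hi_i
            rows.append(W[i])
            rhs.append(hi[i] - b[i])
        if np.isfinite(lo[i]):   # W_i x + b_i >= lo_i  ->  -W_i x <= b_i - lo_i
            rows.append(-W[i])
            rhs.append(b[i] - lo[i])
    return np.array(rows), np.array(rhs)

# Toy layer y = ReLU(W x + b); treat the box [l, u] as the output set of interest.
W = np.array([[1.0, -1.0], [0.5, 2.0]])
b = np.array([0.0, -1.0])
l = np.array([0.5, -1.0])
u = np.array([2.0, 1.0])

lo, hi = relu_box_preimage(l, u)
A, c = linear_preimage_halfspaces(W, b, lo, hi)

def in_preimage(x):
    return bool(np.all(A @ x <= c + 1e-9))

# Membership in the symbolic pre-image agrees with a forward pass:
x = np.array([1.0, 0.2])
y = np.maximum(W @ x + b, 0.0)
assert in_preimage(x) == bool(np.all((l <= y) & (y <= u)))
```

For deeper networks the exact pre-image becomes a union of exponentially many such polytopes (one per ReLU activation pattern), which is why the paper works with compact symbolic over- and underapproximations instead.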
Cite
Text
Dathathri et al. "Inverse Abstraction of Neural Networks Using Symbolic Interpolation." AAAI Conference on Artificial Intelligence, 2019. doi:10.1609/AAAI.V33I01.33013437
Markdown
[Dathathri et al. "Inverse Abstraction of Neural Networks Using Symbolic Interpolation." AAAI Conference on Artificial Intelligence, 2019.](https://mlanthology.org/aaai/2019/dathathri2019aaai-inverse/) doi:10.1609/AAAI.V33I01.33013437
BibTeX
@inproceedings{dathathri2019aaai-inverse,
title = {{Inverse Abstraction of Neural Networks Using Symbolic Interpolation}},
author = {Dathathri, Sumanth and Gao, Sicun and Murray, Richard M.},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2019},
pages = {3437--3444},
doi = {10.1609/AAAI.V33I01.33013437},
url = {https://mlanthology.org/aaai/2019/dathathri2019aaai-inverse/}
}