Differentiable Learning of Submodular Models
Abstract
Can we incorporate discrete optimization algorithms within modern machine learning models? For example, is it possible to include in a deep architecture a layer whose output is the minimal cut of a parametrized graph? Given that these models are trained end-to-end by leveraging gradient information, the introduction of such layers seems very challenging due to their discontinuous outputs. In this paper we focus on the problem of submodular minimization, for which we show that such layers are indeed possible. The key idea is that we can continuously relax the output without sacrificing guarantees. We provide an easily computable approximation to the Jacobian, complemented with a complete theoretical analysis. Finally, these contributions let us experimentally learn probabilistic log-supermodular models via a bi-level variational inference formulation.
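The abstract does not spell out the relaxation, so the following is only a rough sketch and not the authors' method. It assumes the submodular function is an undirected graph cut with learnable edge weights, uses the standard fact that the Lovász extension of a cut is the weighted total variation sum_{i<j} w_ij |x_i - x_j|, solves the relaxed problem over [0,1]^n by unrolled projected subgradient descent, and obtains gradients with respect to the weights by backpropagating through the unrolled iterations, a naive stand-in for the Jacobian approximation developed in the paper. The names `relaxed_min_cut`, `unary`, the step counts, and the learning rates are all illustrative.

```python
# Minimal sketch of a "relaxed min-cut layer" (illustrative; not the paper's algorithm).
import torch
import torch.nn.functional as F

def relaxed_min_cut(w, unary, n_steps=150, lr=0.1):
    """Approximately minimize  sum_{i<j} w[i,j] |x_i - x_j| + <unary, x>  over [0,1]^n."""
    n = unary.shape[0]
    x = torch.full((n,), 0.5, requires_grad=True)            # start at the centre of the box
    for _ in range(n_steps):
        diffs = torch.abs(x.unsqueeze(0) - x.unsqueeze(1))   # |x_i - x_j| for all pairs
        obj = 0.5 * (w * diffs).sum() + unary @ x            # Lovász extension + modular term
        grad, = torch.autograd.grad(obj, x, create_graph=True)
        x = (x - lr * grad).clamp(0.0, 1.0)                  # projected (sub)gradient step
    return x                                                  # relaxed cut indicator in [0,1]^n

# Toy usage: adjust edge weights so the relaxed cut matches a target assignment.
n = 4
w_raw = torch.randn(n, n, requires_grad=True)
unary = torch.tensor([-1.0, -1.0, 1.0, 1.0])                  # negative entries push a node towards 1
target = torch.tensor([1.0, 1.0, 0.0, 0.0])

opt = torch.optim.Adam([w_raw], lr=0.05)
for _ in range(30):
    opt.zero_grad()
    w = F.softplus(0.5 * (w_raw + w_raw.t()))                 # symmetric, non-negative -> cut is submodular
    x = relaxed_min_cut(w, unary)
    loss = ((x - target) ** 2).sum()                          # gradients flow through the unrolled solver
    loss.backward()
    opt.step()
```

A known property of this relaxation, and the reason it loses nothing relative to the discrete problem, is that thresholding the exact continuous minimizer at any level in (0, 1) yields a minimizer of the original submodular function, so the relaxed layer output can still be rounded to a valid discrete cut at test time.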
Cite
Text
Djolonga and Krause. "Differentiable Learning of Submodular Models." Neural Information Processing Systems, 2017.
Markdown
[Djolonga and Krause. "Differentiable Learning of Submodular Models." Neural Information Processing Systems, 2017.](https://mlanthology.org/neurips/2017/djolonga2017neurips-differentiable/)
BibTeX
@inproceedings{djolonga2017neurips-differentiable,
  title     = {{Differentiable Learning of Submodular Models}},
  author    = {Djolonga, Josip and Krause, Andreas},
  booktitle = {Neural Information Processing Systems},
  year      = {2017},
  pages     = {1013--1023},
  url       = {https://mlanthology.org/neurips/2017/djolonga2017neurips-differentiable/}
}