Beyond Gradient Descent for Regularized Segmentation Losses

Abstract

The simplicity of gradient descent (GD) has made it the default method for training ever-deeper and more complex neural networks. Both loss functions and architectures are often explicitly tuned to be amenable to this basic local optimization. In the context of weakly-supervised CNN segmentation, we demonstrate a well-motivated loss function where an alternative optimizer (ADM) achieves state-of-the-art results while GD performs poorly. Interestingly, GD obtains its best result for a "smoother" tuning of the loss function. The results are consistent across different network architectures. Our loss is motivated by well-understood MRF/CRF regularization models in "shallow" segmentation and their known global solvers. Our work suggests that network design and training should pay more attention to optimization methods.

Cite

Text

Marin et al. "Beyond Gradient Descent for Regularized Segmentation Losses." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. doi:10.1109/CVPR.2019.01043

Markdown

[Marin et al. "Beyond Gradient Descent for Regularized Segmentation Losses." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.](https://mlanthology.org/cvpr/2019/marin2019cvpr-beyond/) doi:10.1109/CVPR.2019.01043

BibTeX

@inproceedings{marin2019cvpr-beyond,
  title     = {{Beyond Gradient Descent for Regularized Segmentation Losses}},
  author    = {Marin, Dmitrii and Tang, Meng and Ayed, Ismail Ben and Boykov, Yuri},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2019},
  doi       = {10.1109/CVPR.2019.01043},
  url       = {https://mlanthology.org/cvpr/2019/marin2019cvpr-beyond/}
}