Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks

Abstract

When engineers train deep learning models, they are very much "flying blind". Commonly used methods for real-time training diagnostics, such as monitoring the train/test loss, are limited. Assessing a network's training process solely through these performance indicators is akin to debugging software without access to internal states through a debugger. To address this, we present Cockpit, a collection of instruments that enable a closer look into the inner workings of a learning machine, and a more informative and meaningful status report for practitioners. It facilitates the identification of learning phases and failure modes, such as ill-chosen hyperparameters. These instruments leverage novel higher-order information about the gradient distribution and curvature, which has only recently become efficiently accessible. We believe that such a debugging tool, which we open-source for PyTorch, is a valuable aid in troubleshooting the training process. By revealing new insights, it also contributes more generally to the explainability and interpretability of deep nets.
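
In practice, the open-source PyTorch package wraps the backward pass in a context manager so its instruments can observe gradient and curvature quantities as they are computed. The following is a minimal sketch based on the package's documented usage pattern; the toy model and data are stand-ins, and exact names and signatures may differ between versions of the library (which builds on BackPACK for its extended quantities).

    # Minimal Cockpit usage sketch (illustrative; API details may vary by version).
    import torch
    from backpack import extend  # Cockpit relies on BackPACK-extended modules
    from cockpit import Cockpit, CockpitPlotter
    from cockpit.utils.configuration import configuration

    # Toy model, losses, and optimizer standing in for a real task.
    model = extend(torch.nn.Sequential(torch.nn.Linear(10, 2)))
    loss_fn = extend(torch.nn.CrossEntropyLoss(reduction="mean"))
    individual_loss_fn = torch.nn.CrossEntropyLoss(reduction="none")
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    # Instrument the parameters; "full" enables the complete set of instruments.
    cockpit = Cockpit(model.parameters(), quantities=configuration("full"))
    plotter = CockpitPlotter()

    for global_step in range(5):
        inputs = torch.randn(8, 10)
        labels = torch.randint(0, 2, (8,))
        opt.zero_grad()

        outputs = model(inputs)
        loss = loss_fn(outputs, labels)
        losses = individual_loss_fn(outputs, labels)  # per-sample losses

        # Cockpit hooks into this backward pass to compute its quantities.
        with cockpit(
            global_step,
            info={
                "batch_size": inputs.shape[0],
                "individual_losses": losses,
                "loss": loss,
                "optimizer": opt,
            },
        ):
            loss.backward(create_graph=cockpit.create_graph(global_step))

        opt.step()

    plotter.plot(cockpit)  # render the instrument dashboard

The key design choice visible here is that diagnostics are attached to the existing training loop rather than replacing it: the context manager decides per step which quantities to compute, so the overhead can be traded off against the level of insight.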

Cite

Text

Schneider et al. "Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks." Neural Information Processing Systems, 2021.

Markdown

[Schneider et al. "Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/schneider2021neurips-cockpit/)

BibTeX

@inproceedings{schneider2021neurips-cockpit,
  title     = {{Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks}},
  author    = {Schneider, Frank and Dangel, Felix and Hennig, Philipp},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/schneider2021neurips-cockpit/}
}