Patch Gradient Descent: Training Neural Networks on Very Large Images

Abstract

Current deep learning models cannot be trained directly on very large images because the compute and memory requirements quickly become prohibitive. Patch Gradient Descent (PatchGD) is a learning strategy that enables end-to-end training of deep models on such images. It builds on the standard feedforward-backpropagation scheme, but instead of processing an entire image in one pass, it fills and updates an internal representation of the full image using only small portions (patches) at a time, ensuring that most of the image is covered over the course of the iterations before the final prediction is made. This yields substantial savings in memory and compute. Evaluated on the high-resolution PANDA and UltraMNIST datasets with ResNet50 and MobileNetV2 models, PatchGD consistently outperforms standard gradient descent, especially when compute memory is limited, making it a practical approach for training neural networks on very large images.
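
To make the idea concrete, the following PyTorch-style sketch shows one way such a scheme could look: a persistent latent grid Z holds one feature vector per patch location, a small encoder refreshes only a few patch locations per inner step, and a classification head is trained on the (partially updated) Z. This is an illustrative sketch based on the abstract above, not the authors' implementation; all module names, sizes, and hyperparameters are assumptions.

# Minimal PatchGD-style sketch (illustrative only; not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchEncoder(nn.Module):
    """Small stand-in for a patch-level backbone such as ResNet50."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, x):              # x: (B, 3, patch, patch)
        return self.net(x).flatten(1)  # (B, dim)

def patchgd_step(encoder, head, optimizer, image, label,
                 patch=64, dim=64, inner_steps=4, patches_per_step=4):
    """One PatchGD-style update on a single large image (batch size 1 for clarity)."""
    _, _, H, W = image.shape
    gh, gw = H // patch, W // patch
    # Persistent latent grid: one feature vector per patch location.
    Z = torch.zeros(1, dim, gh, gw)
    coords = [(i, j) for i in range(gh) for j in range(gw)]

    for _ in range(inner_steps):
        optimizer.zero_grad()
        Z = Z.detach()                 # keep previously computed features, drop their graph
        Z_new = Z.clone()
        sel = torch.randperm(len(coords))[:patches_per_step]
        for idx in sel.tolist():       # refresh only a few patch locations per step
            i, j = coords[idx]
            crop = image[:, :, i*patch:(i+1)*patch, j*patch:(j+1)*patch]
            Z_new[0, :, i, j] = encoder(crop)[0]
        logits = head(Z_new.flatten(1))           # predict from the partially updated Z
        loss = F.cross_entropy(logits, label)
        loss.backward()                # gradients flow only through the refreshed patches
        optimizer.step()
        Z = Z_new
    return loss.item()

# Hypothetical usage on a 256x256 image split into 64x64 patches (4x4 grid).
encoder = PatchEncoder(dim=64)
head = nn.Linear(64 * 4 * 4, 10)
optimizer = torch.optim.SGD(list(encoder.parameters()) + list(head.parameters()), lr=0.01)
loss = patchgd_step(encoder, head, optimizer,
                    torch.randn(1, 3, 256, 256), torch.tensor([3]))

The key design point illustrated here is that only the patches processed in the current inner step contribute gradients, while the rest of the latent grid is reused from earlier steps, which is what keeps the memory footprint independent of the full image size.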

Cite

Text

Gupta et al. "Patch Gradient Descent: Training Neural Networks on Very Large Images." NeurIPS 2023 Workshops: WANT, 2023.

Markdown

[Gupta et al. "Patch Gradient Descent: Training Neural Networks on Very Large Images." NeurIPS 2023 Workshops: WANT, 2023.](https://mlanthology.org/neuripsw/2023/gupta2023neuripsw-patch/)

BibTeX

@inproceedings{gupta2023neuripsw-patch,
  title     = {{Patch Gradient Descent: Training Neural Networks on Very Large Images}},
  author    = {Gupta, Deepak and Mago, Gowreesh and Chavan, Arnav and Prasad, Dilip and Thomas, Rajat Mani},
  booktitle = {NeurIPS 2023 Workshops: WANT},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/gupta2023neuripsw-patch/}
}