Multi-GPU Training of ConvNets

Abstract

In this work, we consider a standard architecture [1] trained on the ImageNet dataset [2] for classification and investigate methods to speed up convergence by parallelizing training across multiple GPUs. We used up to 4 NVIDIA TITAN GPUs, each with 6 GB of RAM. While our experiments are performed on a single server, our GPUs have disjoint memory spaces and, just as in the distributed setting, communication overheads are an important consideration. Unlike previous work [9, 10, 11], we do not aim to improve the underlying optimization algorithm; instead, we isolate the impact of parallelism while using standard supervised back-propagation and synchronous mini-batch stochastic gradient descent.
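
As a rough illustration of the synchronous data-parallel setup the abstract describes (not the paper's implementation, which runs CUDA kernels on real GPUs), the following NumPy sketch simulates several "GPUs" that each compute gradients on a disjoint shard of the mini-batch; the gradients are then averaged, mimicking an all-reduce, before one synchronous parameter update. The linear model, function names, and hyperparameters are illustrative assumptions.

  # Minimal sketch of synchronous mini-batch SGD with simulated data parallelism.
  import numpy as np

  def grad_linear(w, X, y):
      """Gradient of 0.5 * ||Xw - y||^2 / n on one shard of the mini-batch."""
      n = X.shape[0]
      return X.T @ (X @ w - y) / n

  def sync_data_parallel_sgd(X, y, num_gpus=4, lr=0.1, epochs=50, batch=64, seed=0):
      rng = np.random.default_rng(seed)
      w = np.zeros(X.shape[1])                  # parameters, kept identical on every replica
      for _ in range(epochs):
          idx = rng.permutation(len(X))
          for start in range(0, len(X), batch):
              mb = idx[start:start + batch]
              shards = np.array_split(mb, num_gpus)              # disjoint shard per "GPU"
              grads = [grad_linear(w, X[s], y[s]) for s in shards if len(s)]
              g = np.mean(grads, axis=0)                         # average gradients (all-reduce)
              w -= lr * g                                        # one synchronous update
      return w

  if __name__ == "__main__":
      rng = np.random.default_rng(1)
      X = rng.normal(size=(1024, 10))
      w_true = rng.normal(size=10)
      y = X @ w_true + 0.01 * rng.normal(size=1024)
      w_hat = sync_data_parallel_sgd(X, y)
      print("parameter error:", np.linalg.norm(w_hat - w_true))

Because every replica applies the same averaged gradient, the update is mathematically equivalent to large-batch SGD on a single device; in the real multi-GPU setting the averaging step is where the communication overhead mentioned in the abstract arises.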

Cite

Text

Yadan et al. "Multi-GPU Training of ConvNets." International Conference on Learning Representations, 2014.

Markdown

[Yadan et al. "Multi-GPU Training of ConvNets." International Conference on Learning Representations, 2014.](https://mlanthology.org/iclr/2014/yadan2014iclr-multi/)

BibTeX

@inproceedings{yadan2014iclr-multi,
  title     = {{Multi-GPU Training of ConvNets}},
  author    = {Yadan, Omry and Adams, Keith and Taigman, Yaniv and Ranzato, Marc'Aurelio},
  booktitle = {International Conference on Learning Representations},
  year      = {2014},
  url       = {https://mlanthology.org/iclr/2014/yadan2014iclr-multi/}
}