Multi-GPU Training of ConvNets
Abstract
In this work, we consider a standard architecture [1] trained on the ImageNet dataset [2] for classification and investigate methods to accelerate convergence by parallelizing training across multiple GPUs. Our experiments use up to 4 NVIDIA TITAN GPUs with 6GB of RAM each. While the experiments are performed on a single server, the GPUs have disjoint memory spaces, so, just as in the distributed setting, communication overheads are an important consideration. Unlike previous work [9, 10, 11], we do not aim to improve the underlying optimization algorithm; instead, we isolate the impact of parallelism while using standard supervised back-propagation and synchronous mini-batch stochastic gradient descent.
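The sketch below illustrates the general scheme the abstract describes: synchronous data-parallel mini-batch SGD, where each worker computes gradients on its shard of the mini-batch and a gradient average forms the synchronization (communication) step before a shared update. It is a CPU-side NumPy simulation with an assumed worker count, learning rate, and toy linear model, not the authors' CUDA implementation.

```python
# Minimal sketch of synchronous data-parallel mini-batch SGD.
# Each of K workers (standing in for GPUs with disjoint memory) computes a
# gradient on its shard; gradients are averaged (the communication step);
# every replica then applies the same update. Toy linear regression model,
# chosen only for illustration.
import numpy as np

rng = np.random.default_rng(0)
K = 4                      # assumed number of workers ("GPUs")
d = 8                      # feature dimension of the toy model
w = np.zeros(d)            # model parameters, replicated on every worker
lr = 0.1                   # assumed learning rate

true_w = rng.normal(size=d)

def local_gradient(w, X, y):
    """Mean-squared-error gradient computed on one worker's shard."""
    err = X @ w - y
    return X.T @ err / len(y)

for step in range(200):
    # Draw one global mini-batch and split it evenly across the workers.
    X = rng.normal(size=(K * 32, d))
    y = X @ true_w + 0.01 * rng.normal(size=K * 32)
    shards = zip(np.array_split(X, K), np.array_split(y, K))

    # Each worker computes its gradient independently (in parallel on real GPUs).
    grads = [local_gradient(w, Xs, ys) for Xs, ys in shards]

    # Synchronization point: average the gradients, then update all replicas.
    # On devices with disjoint memory, this averaging is where the
    # communication overhead mentioned in the abstract arises.
    w -= lr * np.mean(grads, axis=0)

print("parameter error:", np.linalg.norm(w - true_w))
```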
Cite

Text

Yadan et al. "Multi-GPU Training of ConvNets." International Conference on Learning Representations, 2014.

Markdown

[Yadan et al. "Multi-GPU Training of ConvNets." International Conference on Learning Representations, 2014.](https://mlanthology.org/iclr/2014/yadan2014iclr-multi/)

BibTeX
@inproceedings{yadan2014iclr-multi,
title = {{Multi-GPU Training of ConvNets}},
author = {Yadan, Omry and Adams, Keith and Taigman, Yaniv and Ranzato, Marc'Aurelio},
booktitle = {International Conference on Learning Representations},
year = {2014},
url = {https://mlanthology.org/iclr/2014/yadan2014iclr-multi/}
}