Toward Scalable Parallel Training of Deep Neural Networks
Event Type: Workshop
Tags: Deep Learning, Machine Learning, SIGHPC Workshop
Time: Monday, November 13th, 3:30pm - 3:54pm
Location: 502-503-504
Description: We propose a new framework for parallelizing deep neural network training that maximizes the amount of data ingested by the training algorithm. Our proposed framework, called Livermore Tournament Fast Batch Learning (LTFB), targets large-scale data problems. The LTFB approach creates a set of Deep Neural Network (DNN) models and trains each instance of these models independently and in parallel. Periodically, each model selects another model to pair with, exchanges models, and then runs a local tournament against a held-out tournament dataset. The winning model continues training on the local training dataset. This new approach maximizes computation and minimizes the amount of synchronization required to train deep neural networks, a major bottleneck in existing synchronous deep learning algorithms. We evaluate our proposed algorithm on two HPC machines at Lawrence Livermore National Laboratory, including an early-access IBM Power8+ machine with NVIDIA Tesla P100 GPUs. Experimental evaluations of the LTFB framework on two popular image classification benchmarks, CIFAR10 and ImageNet, show significant speedups compared to the sequential baseline.
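
The description above outlines the LTFB cycle: independent local training, random pairing, model exchange, and a tournament on held-out local data whose winner keeps training. The following is a minimal sketch of that cycle, not the authors' implementation: it assumes a toy logistic-regression "model" in place of a DNN and a serial loop over replicas in place of parallel HPC ranks; make_data, sgd_step, and accuracy are hypothetical helpers introduced only for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    def make_data(n, d, w_true):
        # Synthetic linearly separable data; each call makes one shard.
        X = rng.normal(size=(n, d))
        y = (X @ w_true > 0).astype(float)
        return X, y

    def sgd_step(w, X, y, lr=0.1):
        # One logistic-regression SGD step (stand-in for DNN training).
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        return w - lr * X.T @ (p - y) / len(y)

    def accuracy(w, X, y):
        preds = (X @ w > 0).astype(float)
        return float((preds == y).mean())

    d, replicas, rounds, local_steps = 20, 4, 10, 25
    w_true = rng.normal(size=d)
    shards = [make_data(200, d, w_true) for _ in range(replicas)]    # local training data
    tourneys = [make_data(100, d, w_true) for _ in range(replicas)]  # held-out tournament sets
    models = [rng.normal(size=d) * 0.01 for _ in range(replicas)]

    for _ in range(rounds):
        # Phase 1: each replica trains independently on its own shard.
        for i in range(replicas):
            X, y = shards[i]
            for _ in range(local_steps):
                models[i] = sgd_step(models[i], X, y)

        # Phase 2: each replica pairs with a random partner, exchanges
        # models, and keeps whichever scores higher on its own held-out
        # tournament set; the winner continues training locally.
        partners = [int(rng.integers(replicas)) for _ in range(replicas)]
        new_models = []
        for i in range(replicas):
            Xt, yt = tourneys[i]
            local, remote = models[i], models[partners[i]]
            winner = local if accuracy(local, Xt, yt) >= accuracy(remote, Xt, yt) else remote
            new_models.append(winner.copy())
        models = new_models

    for i in range(replicas):
        Xt, yt = tourneys[i]
        print(f"replica {i}: tournament accuracy = {accuracy(models[i], Xt, yt):.3f}")

Note the design point the abstract emphasizes: replicas only exchange model parameters at tournament time, so synchronization cost is decoupled from per-batch gradient traffic, the bottleneck in synchronous data-parallel training.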




