BlazingText: Scaling and Accelerating Word2Vec using
Multiple GPUs
Author/Presenter
Event Type: Workshop
Topics: Deep Learning, Machine Learning, SIGHPC Workshop
Time: Monday, November 13th, 3:54pm - 4:18pm
Location: 502-503-504
Description: Word2Vec is a popular algorithm used for generating
dense vector representations of words in large corpora
using unsupervised learning. The resulting vectors have
been shown to capture semantic relationships between the
corresponding words and are used extensively for many
downstream natural language processing (NLP) tasks such as
sentiment analysis, named entity recognition, and machine
translation. Most open-source implementations of the
algorithm have been parallelized for multi-core CPU
architectures including the original C implementation by
Mikolov et al. and FastText by Facebook. A few other
implementations have attempted to leverage GPU
parallelization but at the cost of accuracy and
scalability. In this work, we present BlazingText, a
highly optimized CUDA implementation of Word2Vec
that can leverage multiple GPUs for training.
BlazingText can achieve a training speed of up to 43M
words/sec on 8 GPUs, which is a 9x speedup over
8-threaded CPU implementations, with minimal effect on
the quality of the embeddings.
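To make the abstract's claim concrete, the sketch below shows what "capturing semantic relationships" means in practice: words with related meanings end up with vectors pointing in similar directions, typically measured by cosine similarity. The three-dimensional vectors here are hypothetical toy values for illustration only; real Word2Vec embeddings are learned from a corpus and usually have hundreds of dimensions.

```python
import math

# Hypothetical toy embeddings (illustrative only; real Word2Vec
# vectors are learned from large corpora, not hand-assigned).
embeddings = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.85, 0.75, 0.20],
    "apple": [0.10, 0.20, 0.90],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1.0 = same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Semantically related words should score higher than unrelated ones.
sim_related = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_unrelated = cosine_similarity(embeddings["king"], embeddings["apple"])
print(sim_related > sim_unrelated)  # True with these toy vectors
```

Downstream NLP tasks such as sentiment analysis or named entity recognition consume these vectors as input features, which is why embedding quality, and not just training speed, matters for a GPU implementation.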