A19: Performance Analysis of a Parallelized Restricted
Boltzmann Machine Artificial Neural Network Using OpenACC
Framework and TAU Profiling System
Session: Poster Reception
Author
Event Type: ACM Student Research Competition, Poster, Reception
Time: Tuesday, November 14th, 5:15pm - 7pm
Location: Four Seasons Ballroom
Description: Restricted Boltzmann Machines are stochastic
neural networks that learn a probability distribution from
the connection weights between the nodes of the hidden and
visible layers. This distribution makes the model well
suited to classifying large amounts of data, which could be
useful in work settings such as a research lab.
Parallelizing these neural networks would allow data to be
classified at a much faster rate than before. On a
high-performance computer, it was determined that
parallelizing the neural networks and offloading the work to
a GPU through OpenACC could decrease the runtime of the
algorithm by over 35%. Using the Tuning and Analysis
Utilities (TAU) profiling system, it was found that
scheduling the program is effective only if the data size is
large enough, and that increasing the number of thread
blocks used for scheduling allows for greater performance
gains than increasing the number of threads in each thread
block.




