Modeling Large Compute Nodes with Heterogeneous Memories
with the Cache-Aware Roofline Model
Author/Presenters
Event Type
Workshop
Accelerators
Benchmarks
Compiler Analysis and Optimization
Deep Learning
Effective Application of HPC
Energy
Exascale
GPU
I/O
Parallel Application Frameworks
Parallel Programming Languages, Libraries, Models and Notations
Performance
Simulation
Storage
Time
Monday, November 13th, 3:30pm - 4pm
Location
704-706
Description
In order to fulfill the needs of modern applications, computing systems are becoming more powerful, heterogeneous, and complex. NUMA platforms and emerging high-bandwidth memories offer new opportunities for performance improvement. However, they also increase hardware and software complexity, making application performance analysis and optimization an even harder task. The Cache-Aware Roofline Model (CARM) is an insightful yet simple model designed to address this issue. It provides feedback on potential application bottlenecks and shows how far application performance is from the achievable hardware upper bounds. However, it does not encompass NUMA systems and next-generation processors with heterogeneous memories, even though some application bottlenecks originate in those memory subsystems and would benefit from CARM insights. In this presentation, we fill the missing requirements to bring recent large shared-memory systems into the scope of the CARM. We provide the methodology to instantiate and validate the model on a NUMA system as well as on the latest Xeon Phi processor equipped with configurable hybrid memory. Finally, we show the model's ability to expose several bottlenecks of such systems that were not captured by the original CARM.
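
For reference, the roofline bound underlying the CARM can be sketched as follows, assuming F_peak denotes the platform's peak floating-point performance, B_mem the measured bandwidth of a given memory level (e.g., a cache level, DDR, or MCDRAM), and AI the arithmetic intensity in flops per byte of traffic seen from the core. This is a minimal illustration of the general roofline form, not the exact instantiation presented in the talk:

    F_attainable(AI) = min(AI x B_mem, F_peak)

On a node with heterogeneous memories, each memory level contributes its own bandwidth B_mem and hence its own roofline, which is what allows a bottleneck to be attributed to a specific memory subsystem.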