With the increasing prevalence of multicore processors, shared-memory programming models are essential. OpenMP is a popular, portable, widely supported and easy-to-use shared-memory model. Developers usually find OpenMP easy to learn. They are often disappointed, however, with the performance and scalability of the resulting code. This stems not from shortcomings of OpenMP but from the lack of depth with which it is employed. Our Advanced OpenMP Programming tutorial addresses this critical need by exploring the implications of possible OpenMP parallelization strategies, in terms of both correctness and performance.
We assume attendees understand basic parallelization and OpenMP concepts. We focus on performance aspects, e.g., data/thread locality, false sharing and exploitation of vector units. All topics are accompanied by case studies, and we discuss the corresponding language features in depth. The focus is solely on performance programming for multicore architectures. Throughout all topics, we present the recent additions of OpenMP 5.0.
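As a flavor of the performance aspects named above, the following minimal sketch touches three of them: data/thread locality, false sharing and vector units. It is an illustration only, not material taken from the tutorial. The function names `init_first_touch` and `dot` are hypothetical, and the locality benefit assumes a first-touch page placement policy and threads that stay bound to their cores.

```c
#include <stddef.h>

/* Hypothetical helper: initialize the array in parallel with a static
 * schedule. Under a first-touch policy (an assumption about the OS),
 * each page is placed close to the thread that first writes it. */
void init_first_touch(double *x, size_t n, double value)
{
    #pragma omp parallel for schedule(static)
    for (size_t i = 0; i < n; ++i)
        x[i] = value;
}

/* Hypothetical kernel: dot product. The same static schedule lets each
 * thread read the part of the arrays it initialized. The reduction
 * clause gives every thread a private partial sum, which avoids the
 * false sharing of a shared per-thread sums array, and the simd
 * directive asks the compiler to vectorize the loop. */
double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    #pragma omp parallel for simd schedule(static) reduction(+:sum)
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

The code avoids OpenMP runtime calls on purpose, so it also compiles and gives the same result when OpenMP is disabled and the pragmas are ignored.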