International HPC Summer School 2016
Parallel programming: accelerator track 2
John Urbanic, PRACE, XSEDE, RIKEN, Compute Canada
June 2016
Slide contents
Our Workshop Environment
Exercise 1 C Solution
Exercise 1 Fortran Solution
Exercise 1: Compiler output (C)
Exercise 1: Performance
What’s with the OpenMP?
What went wrong?
Basic Concept
Multiple Times Each Iteration
Excessive Data Transfers
Data Management
First, about that “reduction”
Data Construct Syntax and Scope
Data Clauses
Array Shaping
Compiler will (increasingly) often make a good guess…
Data Regions Have Real Consequences
Data Regions Are Different Than Compute Regions
Data Movement Decisions
Exercise 2: Use acc data to minimize transfers
Exercise 2 C Solution
Exercise 2 Fortran Solution
Exercise 2: Performance
OpenACC or OpenMP?
OpenACC or OpenMP on Larger Data?
Latest Happenings In Data Management
Further speedups
General Principles: Finding Parallelism In Code
Is OpenACC Living Up To My Claims?
In Conclusion…