Parallel programming: accelerator track 1

  • John Urbanic, PRACE, XSEDE, RIKEN, Compute Canada
  • June 2016

Slide contents

  • Our Workshop Environment
  • Introduction to OpenACC
  • What is OpenACC?
  • Directives
  • Familiar to OpenMP Programmers
  • How Else Would We Accelerate Applications?
  • Key Advantages Of This Approach
  • A Few Cases
  • A Champion Case
  • Broad Accelerator Support
  • NVIDIA Rules
  • True Standard
  • A Simple Example: SAXPY
  • kernels: Our first OpenACC Directive
  • General Directive Syntax and Scope
  • Complete SAXPY Example Code
  • C Detail: the restrict keyword
  • Compile and Run
  • Compare: Partial CUDA C SAXPY Code
  • Compare: Partial CUDA Fortran SAXPY Code
  • Again: Complete SAXPY Example Code
  • Big Difference!
  • This looks easy! Too easy…
  • Data Dependencies
  • No Data Dependency
  • Data Dependency
  • Data Dependency
  • Data Dependencies
  • Data Dependencies
  • Our Foundation Exercise: Laplace Solver
  • Exercise Foundation: Jacobi Iteration
  • Serial Code Implementation
  • Serial C Code (kernel)
  • Serial C Code Subroutines
  • Whole C Code
  • Serial Fortran Code (kernel)
  • Serial Fortran Code Subroutines
  • Whole Fortran Code
  • Exercises: General Instructions for Compiling
  • Our Workshop Environment
  • Our Environment This Week
  • © 2015 Pittsburgh Supercomputing Center
  • Bridges Node Types
  • Getting Connected
  • Compiling
  • Editors
  • Compiling
  • Multiple Sessions
  • Our Setup For This Workshop
  • Preliminary Exercise
  • Exercises: General Instructions for Compiling
  • Exercises: Very useful compiler option
  • Exercises: General Instructions for Running
  • Exercises: Very useful compiler option
  • Exercises: General Instructions for Running
  • Exercise 1: Using kernels to parallelize the main loops