SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Fifth Workshop on Accelerator Programming Using Directives (WACCPD)


Heterogeneous Programming and Optimization of Gyrokinetic Toroidal Code Using Directives

Abstract: The latest production version of the fusion particle simulation code, Gyrokinetic Toroidal Code (GTC), has been ported to and optimized for the next-generation exascale GPU supercomputing platform. Heterogeneous programming using directives has been utilized to fuse and thus balance the continuously implemented physical capabilities and rapidly evolving software/hardware systems. The original code has been refactored into a set of unified functions/calls to enable acceleration for all particle species. Binning and GPU texture caching techniques have also been used to boost the performance of the particle push and shift operations. To identify hotspots, the GPU version of GTC was first benchmarked on up to 8000 nodes of the Titan supercomputer, showing about 2–3 times overall speedup when comparing NVIDIA M2050 GPUs to Intel Xeon X5670 CPUs. This Phase I optimization was followed by further optimizations in Phase II, where single-node tests show an overall speedup of about 34 times on SummitDev and 7.9 times on Titan. Real physics tests on the Summit machine showed impressive scaling properties, reaching roughly 50% efficiency on 928 nodes. The GPU+CPU speedup over CPU-only runs is more than 20 times, leading to unparalleled speed.
