Kernel-Based and Total Performance Analysis of CGYRO on 4 Leadership Systems
Time: Thursday, November 15th, 8:30am - 5pm
Description: We present the results of an exhaustive performance analysis of the CGYRO code on 4 leadership systems spanning 5 different configurations (2 KNL-based, 1 Skylake-based, and 2 hybrid CPU-GPU architectures). CGYRO is an Eulerian gyrokinetic solver designed and optimized for collisional, electromagnetic, multiscale fusion plasma simulation. It is based on the well-known GYRO code, but redesigned from the ground up to operate efficiently on multicore and GPU-accelerated systems. The gyrokinetic equations specify a 5-dimensional distribution function for each species, with species coupled through both the Maxwell equations and the collision operator. For the cross-machine performance analysis, we report and compare timings for 4 computational and 4 communication kernels. This kernel-based breakdown illustrates the strengths and weaknesses of the floating-point and communication architectures of the respective systems. We also give an overview of the physical equations solved, the scalable numerical methods used, and the data communication patterns required by each kernel.