Abstract: We present the results of an exhaustive performance analysis of the CGYRO code on 4 leadership systems spanning 5 different configurations (2 KNL-based, 1 Skylake-based, and 2 hybrid CPU-GPU architectures). CGYRO is an Eulerian gyrokinetic solver designed and optimized for collisional, electromagnetic, multiscale fusion plasma simulation. It is based on the well-known GYRO code, but redesigned from the ground up to operate efficiently on multicore and GPU-accelerated systems. The gyrokinetic equations specify a 5-dimensional distribution function for each species, with species coupled through both the Maxwell equations and the collision operator. For the cross-machine performance analysis, we report and compare timings for 4 computational and 4 communication kernels. This kernel-based breakdown illustrates the strengths and weaknesses of the floating-point and communication architectures of the respective systems. An overview of the physical equations solved, the scalable numerical methods used, and the data communication patterns required by each kernel is also given.
Best Poster Finalist (BP): no
Poster summary: PDF
Reproducibility Description Appendix: PDF