Authors: Michael Heroux (Sandia National Laboratories, St. John’s University), Jack Dongarra (University of Tennessee), Piotr Luszczek (University of Tennessee)
Abstract: The High Performance Conjugate Gradients (HPCG) Benchmark is a community metric for ranking high performance computing systems, officially part of the TOP500 and a companion to the LINPACK benchmark.
In this BOF we first present an update on HPCG policies and opportunities for optimizing performance, and follow with presentations from vendors who have participated in HPCG optimization efforts, in particular efforts for ARM-based, Nvidia, Intel and IBM systems. We spend the remaining time in open discussion about the future of HPCG design and implementation strategies for further improvements.
Long Description: The High Performance Conjugate Gradients (HPCG) Benchmark is designed to test features of a high performance computing (HPC) system in a way that complements the High Performance Linpack (HPL) benchmark. HPL tends to approach the maximum achievable floating point performance on a given system, while most real applications reach a tiny fraction of what HPL achieves. In contrast, HPCG achieves a much smaller fraction of peak performance and instead tests the latency and bandwidth of the interconnect network, the manycore/accelerator nodes and individual processors.
HPL derives its performance from an existing collection of optimized dense matrix kernels. As a result, achieving good performance from HPL is fairly straightforward. HPCG is a newer benchmark that depends primarily on sparse linear algebra, and its optimization strategies and implementations are still emerging. Even so, major computing system vendors such as Arm, Fujitsu, IBM, Intel, Nvidia and their related leadership computing facilities have made significant investments in HPCG optimization. In addition, major Chinese national supercomputing centers in Tianjin and Wuxi have committed significant resources to optimizing HPCG for their systems.
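To make the contrast with dense kernels concrete, the following is a minimal sketch of an unpreconditioned conjugate gradient solve over a sparse matrix stored in compressed sparse row (CSR) form. It is not the HPCG reference code: the actual benchmark uses a preconditioned variant on a far larger distributed problem, and the function names here (`csr_matvec`, `conjugate_gradient`, `laplacian_1d_csr`) and the small 1D test matrix are assumptions chosen purely for illustration.

```python
# Illustrative sketch only; not the HPCG reference implementation.
# All names below are hypothetical and chosen for this example.

def csr_matvec(indptr, indices, data, x):
    """Sparse matrix-vector product y = A x for A stored in CSR form."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Each nonzero costs one multiply-add but two indirect loads,
        # so this loop is limited by memory traffic, not arithmetic.
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def dot(u, v):
    """Dot product; a global reduction in a distributed setting."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(indptr, indices, data, b, tol=1e-10, max_iter=1000):
    """Unpreconditioned CG for a symmetric positive definite CSR matrix."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = csr_matvec(indptr, indices, data, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def laplacian_1d_csr(n):
    """CSR arrays for the n x n tridiagonal matrix with stencil [-1, 2, -1]."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return indptr, indices, data


# Usage: build a right-hand side from a known solution and recover it.
n = 8
indptr, indices, data = laplacian_1d_csr(n)
b = csr_matvec(indptr, indices, data, [1.0] * n)
x = conjugate_gradient(indptr, indices, data, b)
```

The sparse matrix-vector product performs only a few floating point operations per value loaded from memory, and each dot product becomes a global reduction when the vectors are distributed across nodes. Those two patterns are why a benchmark of this kind exercises memory and network behavior rather than peak arithmetic throughput.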
HPCG has been officially part of the TOP500 effort since ISC2017 in June 2017. Since then, it has provided an important counterpoint to the conversations in the HPC community that tend to highlight ambitious efforts to achieve ever-increasing performance rates for the LINPACK benchmark.
In this BOF we review the architecture of the HPCG reference code, emphasizing opportunities for improving its performance and describing what kinds of optimizations are permissible, especially for the latest version, HPCG 3.0. We follow this with a presentation from each of the computer system vendors on how they have optimized HPCG for their systems. Confirmed presenters are from Arm, Fujitsu, Intel and Nvidia. Each vendor will discuss recent efforts to improve HPCG performance on their latest platform. We conclude the BOF with questions and a general discussion about HPCG, including future plans for its design and implementation.
This BOF will be valuable to any HPC community member who is interested in benchmarking and in how HPCG can be used to gain insight into large-scale system performance.