Understanding Simultaneous Impact of Network QoS and Power on HPC Application Performance
Abstract: With the growing complexity of high-performance computing (HPC) systems, application performance variation has increased enough to disrupt overall system throughput. This variation is expected to worsen in the future, when job schedulers will have to manage flow resources such as power, network, and I/O in addition to traditional resources such as nodes and memory. In this work, we study the simultaneous impact of inter-job interference, InfiniBand service levels, and power capping on different applications in a controlled experimental setup, with the goal of understanding the range of performance variation as well as potential mitigation strategies.
Back to Computational Reproducibility at Exascale 2018 (CRE2018) Archive Listing