SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Exploring Application Performance on Fat-Tree Networks in the Presence of Congestion

Authors: Philip A. Taffet (Rice University, Lawrence Livermore National Laboratory), Sanil Rao (University of Virginia, Lawrence Livermore National Laboratory), Ian Karlin (Lawrence Livermore National Laboratory)

Abstract: Network congestion, which occurs when multiple applications simultaneously use shared links in a cluster network, can cause poor communication performance, decreasing the performance and scalability of parallel applications. Many studies of congestion are performed while clusters also run other production workloads, which makes it harder to isolate causes and their effects. To examine congestion in a more controlled setting, we used dedicated access time on an HPC cluster and measured the performance of three HPC applications with different communication patterns, run with varying amounts and types of background traffic. This enables us to assess the relative sensitivity of the applications to congestion caused by different traffic patterns. Our tests show that the applications were not significantly impacted by even the most aggressive neighboring patterns: performance degradation was 7% or less in all cases, pointing to the resiliency of the fat-tree topology.

Best Poster Finalist (BP): no

Poster: pdf
Poster summary: PDF
Reproducibility Description Appendix: PDF
