Exploring Application Performance on Fat-Tree Networks in the Presence of Congestion
Time: Thursday, November 15th, 8:30am - 5pm
Description: Network congestion, which occurs when multiple applications simultaneously use shared links in a cluster network, can degrade communication performance and, in turn, the performance and scalability of parallel applications. Many congestion studies are conducted while the cluster is also running other production workloads, which makes it hard to isolate causes and effects. To examine congestion in a more controlled setting, we used dedicated access time on an HPC cluster and measured the performance of three HPC applications with different communication patterns, each run with varying amounts and types of background traffic. This enables us to assess the relative sensitivity of the applications to congestion caused by different traffic patterns. Our tests show that the applications were not significantly impacted by even the most aggressive neighboring patterns: performance degradation was 7% or less in every case, pointing to the resiliency of the fat-tree topology.