SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters


Authors: Samuel D. Pollard (University of Oregon), Nikhil Jain (Lawrence Livermore National Laboratory), Stephen Herbein (Lawrence Livermore National Laboratory), Abhinav Bhatele (Lawrence Livermore National Laboratory)

Abstract: Interference between jobs competing for network bandwidth on a fat-tree cluster can cause significant variability and degradation in performance. These performance issues can be mitigated or completely eliminated if the resource allocation policy takes the network topology into account when allocating nodes to jobs. We implement a topology-aware node allocation policy for fat-tree networks that allocates isolated partitions to jobs in order to eliminate inter-job interference. We compare this policy with a topology-oblivious policy in terms of the execution time of individual jobs with different communication patterns. We also evaluate the cluster's quality of service under both policies using system utilization, schedule makespan, and job wait time. The results obtained for production workloads indicate that a topology-aware node allocation policy can provide interference-free execution without negatively impacting the cluster's quality of service.
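
The quality-of-service metrics named in the abstract can be illustrated with a minimal Python sketch. The `Job` record, the function names, and the exact definitions used here (makespan measured from the earliest submission to the last completion, utilization as occupied node-time divided by available node-time over the makespan) are assumptions made for illustration and are not taken from the paper, which may define these quantities differently.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical job record; field names are illustrative assumptions.
    submit: float  # time the job entered the queue
    start: float   # time the job began running
    end: float     # time the job finished
    nodes: int     # number of nodes allocated to the job


def makespan(jobs):
    """Time from the earliest submission to the last completion (assumed definition)."""
    return max(j.end for j in jobs) - min(j.submit for j in jobs)


def mean_wait_time(jobs):
    """Average time jobs spent queued before starting."""
    return sum(j.start - j.submit for j in jobs) / len(jobs)


def utilization(jobs, total_nodes):
    """Fraction of available node-time occupied by jobs over the makespan."""
    busy = sum((j.end - j.start) * j.nodes for j in jobs)
    return busy / (total_nodes * makespan(jobs))


# A made-up three-job schedule on a 16-node cluster.
schedule = [
    Job(submit=0, start=0, end=10, nodes=4),
    Job(submit=0, start=10, end=20, nodes=8),
    Job(submit=5, start=10, end=15, nodes=4),
]
print(makespan(schedule), mean_wait_time(schedule), utilization(schedule, 16))
```

Computing the same three numbers for the schedules produced by two allocation policies on one workload gives a simple basis for the kind of comparison the abstract describes.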



