Framework for Scalable Intra-Node Collective Operations Using Shared Memory

SC18 Proceedings

Authors: Surabhi Jain (Intel Corporation), Rashid Kaleem (Intel Corporation), Marc Gamell Balmana (Intel Corporation), Akhil Langer (Intel Corporation), Dmitry Durnov (Intel Corporation), Alexander Sannikov (Intel Corporation), Maria Garzaran (Intel Corporation)

Abstract: Collective operations are used in MPI programs to express common communication patterns, collective computations, or synchronizations. In many collectives, such as barrier or allreduce, the intra-node component is on the critical path, because inter-node communication cannot start until the intra-node component has completed. Thus, as the number of cores per node grows, optimizations that exploit intra-node shared memory become increasingly important.

In this paper, we focus on the performance benefit of optimizing intra-node collectives using shared memory. We optimize several collectives by using the broadcast and reduce primitives as building blocks for the others. A comparison of our implementation, built on top of MPICH, shows significant speedups over the original MPICH implementation, MVAPICH, and OpenMPI, among others.
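
To illustrate the building-block idea, here is a minimal sketch of how an allreduce could be composed from a reduce followed by a broadcast over memory shared within one node. It is not the paper's implementation: Python threads stand in for MPI ranks, and the names `SharedNode` and `run_allreduce` and the one-slot-per-rank buffer layout are assumptions made for illustration only.

```python
import threading


class SharedNode:
    """Toy model of one node: each 'rank' is a thread sharing this object's memory."""

    def __init__(self, nranks):
        self.nranks = nranks
        self.slots = [None] * nranks   # one shared slot per rank (assumed layout)
        self.result = None             # shared buffer used by broadcast
        self.barrier = threading.Barrier(nranks)

    def reduce(self, rank, value, op, root=0):
        """Every rank publishes its value; the root combines all of them."""
        self.slots[rank] = value
        self.barrier.wait()            # all contributions are now visible
        out = None
        if rank == root:
            acc = self.slots[0]
            for v in self.slots[1:]:
                acc = op(acc, v)
            out = acc
        self.barrier.wait()            # slots may be reused after this point
        return out

    def bcast(self, rank, value, root=0):
        """The root publishes a value; every rank reads it from shared memory."""
        if rank == root:
            self.result = value
        self.barrier.wait()            # the root's write is now visible
        out = self.result
        self.barrier.wait()            # everyone has read; buffer may be reused
        return out

    def allreduce(self, rank, value, op):
        """Allreduce expressed as reduce-to-root followed by broadcast-from-root."""
        partial = self.reduce(rank, value, op, root=0)
        return self.bcast(rank, partial, root=0)


def run_allreduce(values, op):
    """Run one allreduce with len(values) simulated ranks and collect the results."""
    node = SharedNode(len(values))
    results = [None] * len(values)

    def worker(rank):
        results[rank] = node.allreduce(rank, values[rank], op)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(len(values))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_allreduce([1, 2, 3, 4], lambda a, b: a + b)` gives every simulated rank the same sum. A barrier could be built the same way, as a reduce and broadcast that carry no payload. A real shared-memory implementation would typically replace the heavyweight barriers with per-rank flags and would pay attention to cache-line placement, which this sketch ignores.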
