SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

LLVM-HPC2018: The Fifth Workshop on the LLVM Compiler Infrastructure in HPC


Compiler Optimization for Heterogeneous Locality and Homogeneous Parallelism in OpenCL and LLVM

Authors: Dorit Nuzman (Intel Corporation)

Abstract: Heterogeneous platforms may include accelerators such as Digital Signal Processors (DSPs) that employ software-controlled scratch-pad memories instead of, or in addition to, standard hardware-cached memory. Controlling scratch-pads efficiently typically requires tiling and pipelining loops, thereby optimizing for memory locality rather than parallelism as a primary objective. On the other hand, achieving high performance on CPUs and GPUs typically requires optimizing for data-level parallelism as a primary objective, at the expense of locality. In this lightning talk, we show how OpenCL and LLVM can be used to achieve both target-dependent locality and target-independent parallelism. Such an approach facilitates the development of optimized software for DSP accelerators while enabling its efficient execution on standard servers. Following the work of Tian et al., our approach leverages automatic compiler optimization and relies purely on OpenCL, including its device-side enqueue capability and SPIR-V format.
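As a rough illustration of what "tiling and pipelining loops" for a scratch-pad can look like, the following is a minimal sketch in plain C; it is not the talk's implementation and does not use OpenCL. The names `TILE`, `load_tile`, and `scale_tiled` are hypothetical, the scratch-pad is emulated by two small local buffers, and the DMA transfer is stood in for by `memcpy`, so the load of the next tile only models, rather than achieves, overlap with computation.

```c
#include <stddef.h>
#include <string.h>

#define TILE 64  /* hypothetical tile size; a real value depends on scratch-pad capacity */

/* Stand-in for a DMA transfer into scratch-pad memory (assumption: plain memcpy). */
static void load_tile(float *dst, const float *src, size_t count) {
    memcpy(dst, src, count * sizeof(float));
}

/* Scales n floats by k, one tile at a time, alternating between two buffers. */
void scale_tiled(const float *in, float *out, size_t n, float k) {
    float scratch[2][TILE];               /* two buffers emulate the scratch-pad */
    size_t tiles = (n + TILE - 1) / TILE; /* number of tiles, rounding up */
    if (tiles == 0) return;

    /* Prologue: fetch the first tile before entering the pipelined loop. */
    load_tile(scratch[0], in, n < TILE ? n : TILE);

    for (size_t t = 0; t < tiles; ++t) {
        size_t base = t * TILE;
        size_t len = (n - base < TILE) ? n - base : TILE;

        /* Issue the load of tile t+1 into the other buffer before computing on tile t. */
        if (t + 1 < tiles) {
            size_t nbase = base + TILE;
            size_t nlen = (n - nbase < TILE) ? n - nbase : TILE;
            load_tile(scratch[(t + 1) & 1], in + nbase, nlen);
        }

        /* Inner loop over one tile: iterations are independent, hence data-parallel. */
        const float *cur = scratch[t & 1];
        for (size_t i = 0; i < len; ++i)
            out[base + i] = cur[i] * k;
    }
}
```

The outer loop carries the locality concern (which tile is resident), while the inner loop carries the parallelism concern; on a CPU or GPU the same inner loop could be vectorized or mapped to work-items without the explicit buffering.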
