OpenMP GPU Offload in Flang and LLVM
TimeMonday, November 12th10:30am - 11am
DescriptionGraphics Processing Units (GPUs) have been widely adopted to accelerate the execution of High Performance Computing (HPC) workloads due to their enormous computational throughput, ability to execute a large number of threads inside SIMD groups in parallel, and their use of multithreaded hardware to hide long pipelining and memory access latency. However, developing applications able to exploit the high performance of GPUs requires proper code tuning. As a consequence, computer scientists proposed different approaches to simplify GPU programming, including directive-based programming models such as OpenMP and OpenACC. Their intention is to solve the aforementioned programming challenges with a directive-based approach which allows the users to insert non-executable pragma constructs that guide the compiler to handle the low-level complexities of the system. Flang, a Fortran front end for the LLVM Compiler Infrastructure, has drawn attention from the HPC community. Although Flang supports OpenMP for multicore architectures, it has no capability of offloading parallel regions to accelerator devices. In this paper, we present OpenMP Offload support in Flang targeting NVIDIA GPUs. Our goal is to investigate possible implementation strategies of OpenMP GPU offloading into Flang. The experimental results show that our approach is able to achieve performance similar to existing compilers with OpenMP GPU offload support.