Portable Parallel Performance via Multi-Dimensional Homomorphisms
Time: Thursday, November 15th, 8:30am - 5pm
Description: Achieving portable performance across different parallel architectures and varying problem sizes is hard: for example, a program optimized for multi-core CPUs on large input sizes can differ significantly from the same program optimized for Graphics Processing Units (GPUs) on small sizes.

We propose an approach to performance portability based on multi-dimensional homomorphisms (MDHs), a class of parallelizable functions that covers important application areas, including linear algebra routines (BLAS) and stencil computations. We develop an extended OpenCL implementation schema for MDHs that is generic in the performance-critical parameters of the OpenCL model, and we achieve performance portability by automatically optimizing these parameters for different target architectures and input sizes using auto-tuning.
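
To illustrate the homomorphism property informally, the following Python sketch uses matrix-vector multiplication, a BLAS routine. It assumes the usual reading of a homomorphism: the input can be split along a dimension, the parts processed independently, and the partial results merged with a combine operator. All function names here are hypothetical and chosen for illustration; this is not the OpenCL schema described above, and the `cut` argument only stands in for the kind of parameter an auto-tuner might vary.

```python
def matvec(matrix, vector):
    """Reference matrix-vector product on plain Python lists."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def matvec_split_rows(matrix, vector, cut):
    """Split along the row dimension; partial results are concatenated."""
    top = matvec(matrix[:cut], vector)
    bottom = matvec(matrix[cut:], vector)
    return top + bottom


def matvec_split_cols(matrix, vector, cut):
    """Split along the column dimension; partial results are added pointwise."""
    left = matvec([row[:cut] for row in matrix], vector[:cut])
    right = matvec([row[cut:] for row in matrix], vector[cut:])
    return [l + r for l, r in zip(left, right)]


# Illustrative data (not taken from the talk).
matrix = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12]]
vector = [1, 0, 2, 1]
expected = matvec(matrix, vector)

# Every split position yields the same result, so the position can be
# chosen freely, for example to suit a particular device or input size.
rows_ok = all(matvec_split_rows(matrix, vector, c) == expected
              for c in range(len(matrix) + 1))
cols_ok = all(matvec_split_cols(matrix, vector, c) == expected
              for c in range(len(vector) + 1))
```

Because the result does not depend on where the input is split, the split positions behave like free parameters that can be tuned per architecture and input size without affecting correctness.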

Our results demonstrate competitive, and often significantly better, performance than state-of-the-art approaches for BLAS and stencil computations as used in the important application area of deep learning.