Exascale Challenges in Across-Node Parallelism for Languages and Runtimes
Parallel Programming Languages, Libraries, and Models
TimeMonday, November 12th9:05am - 10am
DescriptionMachines with peak performance exceeding one exaflop/s are just around the corner, and promises of sustained exaflop/s machines abound. Are there significant challenges in runtime frameworks and languages that need to be met to harness the power of these machines? We will examine this question and associated issues.
There are some architectural trends that are becoming clear and some hazily appearing. Individual nodes are getting “fatter” computationally. Accelerators such as GPGPUs and possibly FPGAs are likely parts of the exascale landscape. High bandwidth memory, and non-coherent caches (such as the caches in GPGPUs used typically for constant memory), NVRAMS, and resultant deeper and more complex memory hierarchies will also have to be dealt with.
There is an argument going around in the community, that we have already figured out how to deal with tens of thousands of nodes (100,000 with BG/Q), and now since the number of nodes is not likely to increase, we (the extreme-scale HPC community) have to focus research almost entirely on within-node issues. I believe this is not quite a well-founded argument. I will explain why issues of power/energy/temperature, whole machine (multi-job) optimizations, across node issues like communication optimization, load balancing and fault tolerance are still worthy of significant attention of the exascale runtime and language community. At the same time, there exist issues in handling within-node parallelism that arise mainly or only in the context large multi-node runs.
I will also address the question of how our community should approach research if a large segment of funding sources and application community have started thinking some of the above issues are irrelevant. What should the courage of our convictions lead us to?