Flux: Overcoming Scheduling Challenges for Exascale Workflows

SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

WORKS 2018: 13th Workshop on Workflows in Support of Large-Scale



Authors: Dong H. Ahn (Lawrence Livermore National Laboratory)

Abstract: Many emerging scientific workflows that target high-end HPC systems require complex interplay with the resource and job management software (RJMS). However, portable, efficient, and easy-to-use scheduling and execution of these workflows is still an unsolved problem. We present Flux, a novel, hierarchical RJMS infrastructure that addresses the key scheduling challenges of modern workflows in a scalable, easy-to-use, and portable manner. At the heart of Flux lies its ability to be nested seamlessly within batch allocations created by other schedulers as well as by itself. Once a hierarchy of Flux instances is created within each allocation, its consistent and rich set of well-defined APIs portably and efficiently supports workflows that often feature non-traditional execution patterns, such as requirements for complex co-scheduling, massive ensembles of small jobs, and coordination among jobs in an ensemble.


