Flux: Overcoming Scheduling Challenges for Exascale Workflows
Authors: Dong H. Ahn (Lawrence Livermore National Laboratory)
Abstract: Many emerging scientific workflows that target high-end HPC systems require complex interplay with the resource and job management software~(RJMS). However, portable, efficient and easy-to-use scheduling and execution of these workflows is still an unsolved problem. We present Flux, a novel, hierarchical RJMS infrastructure that addresses the key scheduling challenges of modern workflows in a scalable, easy-to-use, and portable manner. At the heart of Flux lies its ability to be nested seamlessly within batch allocations created by other schedulers as well as itself. Once a hierarchy of Flux instance is created within each allocation, its consistent and rich set of well-defined APIs portably and efficiently support those workflows that can often feature non-traditional execution patterns such as requirements for complex co-scheduling, massive ensembles of small jobs and coordination among jobs in an ensemble.
Back to WORKS 2018: 13th Workshop on Workflows in Support of Large-Scale Archive Listing
Back to Full Workshop Archive Listing