Authors: Kevin Pedretti (Sandia National Laboratories), Michael Ott (Leibniz Supercomputing Centre), Robin Pinning (Hartree Centre), Greg Koenig (Energy Efficient HPC Working Group), Matthias Maiterth (Ludwig Maximilian University of Munich, Intel Corporation), Vadim Elisseev (IBM), Moe Jette (SchedMD LLC)
Abstract: Energy and power-aware job scheduling and resource management (EPAJSRM) capabilities are implemented or planned for large-scale HPC systems at ~10 sites worldwide. Some of these sites are interested in using these capabilities to allow an application to provide hints and other relevant information to an EPAJSRM job scheduler. Another important capability is notifying applications of power management decisions, such as changes in power usage targets, and making them aware of events in the machine that might have made a job run slower. This BoF explores these capabilities from the perspective of three different sites.
Long Description: The capability to manage the power and energy of HPC systems from the job scheduler, as is done for other shared resources such as burst buffers, file-system I/O, and the network, will become an important issue for upcoming Exascale systems. Some of these systems may have power constraints, while others may be allocated an energy budget that must not be exceeded.
The goal of this BoF is to explore three different sites' requirements and visions for developing a two-way interface that allows the application to provide hints or information to the job scheduler, and the job scheduler to provide information on power or energy constraints and how they are being enforced. Ideally, the interface would also provide historical information on an application's runtime characteristics to allow for better scheduling decisions. This could be a general solution that is not specific to the runtime or the job scheduler, but instead acts as middleware linking the application with the job scheduler.
The Energy Efficient HPC Working Group has held two prior BoFs (SC17 and ISC18) that described the commonalities and differences between ~10 sites worldwide that are employing energy and power-aware job scheduling and resource management (EPAJSRM) techniques and started a discussion on implications and analysis from a survey of these sites. Both of these prior BoFs were very well attended and generated a lot of interest from the broader community. This BoF will take that work a step further and dive down into discussing next steps for EPAJSRM. What direction should future work take in this area?
This is extremely relevant to the ~10 sites that have deployed EPAJSRM capabilities, but it is also relevant to sites that are considering implementing them. It is also relevant to the vendor community, particularly system integrators like IBM, Cray, HPE and Fujitsu, who have had to deliver these capabilities to the ~10 sites. It is also of interest to JSRM providers like IBM, SLURM and MOAB. Finally, it is of interest to the academic community, because the forward-looking requirements suggest directions for research.
The BoF session leaders are aware of three power BoF submissions this year. We have coordinated their scope in order to avoid overlap. The PowerAPI/Redfish BoF drives the community toward interface standardization to enable portable power interfaces for tools, applications and middleware. The JSRM BoF primarily discusses site-level power management issues and requirements (inclusive of both infrastructure and HPC systems). The PowerStack BoF incorporates these requirements into its design and engages the audience in design feedback and brainstorming solutions to implementation challenges.