Exhibitor Forum: Enabling HPC and Deep Learning Workloads at Extreme Scale in the Cloud
Presenter
Event Type
Exhibitor Forum
Tags
Clouds and Distributed Computing
Deep Learning
Time
Tuesday, November 13th, 2pm - 2:30pm
Location
D171
Description
Independent research (Reuther et al., J. Parallel Distrib. Comput., 111, 2018, 76–92) underscores the importance of efficient workload management: “For both supercomputers and big data systems, the efficiency of the job scheduler represents a fundamental limit on the efficiency of the system.” However, enabling efficiency at extreme scale in the cloud, for workload management or other purposes, requires sophisticated integration and automation that also scales. Navops Launch integrates deeply with AWS-specific APIs, so the capabilities of this public-cloud provider are fully leveraged in a highly automated fashion. As a compelling proof point, Navops Launch makes routine the scaling of a compute cluster to more than 1,000,000 cores, across 55,000 heterogeneous spot instances spanning three availability zones. As a consequence, even for demanding policy-based launching of cloud instances, heroics are no longer required to scale HPC and Deep Learning workloads to the extreme.