Portable and Reusable Deep Learning Infrastructure with Containers to Accelerate Cancer Studies
Abstract: Advanced programming models, domain-specific languages, and scripting toolkits have the potential to greatly accelerate the adoption of high-performance computing. These complex software systems, however, are often difficult to install and maintain, especially on exotic high-end systems. We consider deep learning workflows used on petascale systems and their redeployment on research clusters using containers. Containers are used to deploy the MPI-based infrastructure, but challenges in efficiency, usability, and complexity must be overcome. In this work, we address these challenges through enhancements to a unified workflow system that manages interaction with the container abstraction, the cluster scheduler, and the programming tools. We also report results from running the application on our system, harnessing 298 TFLOPS (single precision).
ESPM2 2018: Fourth International Workshop on Extreme Scale Programming Models and Middleware