SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Exascale Machine Learning


Authors: Naoya Maruyama (Lawrence Livermore National Laboratory), Brian Van Essen (Lawrence Livermore National Laboratory), Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)

Abstract: HPC has had a profound impact on the recent success of data-driven machine learning. Training deep learning models such as convolutional and recurrent neural networks with big data is an extremely compute-intensive task that routinely benefits from parallel and distributed processing. Upcoming exascale machines are expected to further accelerate the advance of data-oriented machine learning. This BoF starts with short talks by experts from academia, national labs, and industry about the current state of the art in HPC-based machine learning, and concludes with a panel session that provides a forum to discuss opportunities and challenges with future exascale systems.

Long Description: Data-oriented machine learning, represented by deep neural networks, has shown tremendous success in an extremely wide range of domains, including image recognition, natural language understanding, autonomous driving, and scientific and engineering problems where numerical simulations were previously the only computational approach. The current success of machine learning is due in part to the HPC technologies that have been developed by the SC and other related communities, most notably heterogeneous computing with accelerators. Future development of HPC technologies is also vital for addressing the ever-growing computing needs of data-oriented machine learning. In this BoF, we will focus on machine learning from the perspective of HPC technologies, in particular the opportunities and challenges with current large-scale supercomputers and future exascale systems.

Our speakers include leading experts from academia, national labs, and industry. Francis Alexander, the principal investigator of ExaLearn, the newly established US ECP co-design center, will provide perspectives on the role of machine learning in ECP applications. Rio Yokota and Bill Tang will present the latest results on large-scale deep learning using some of the largest machines in the world. Brian Van Essen, the project lead of the LBANN deep learning framework, will discuss algorithms and techniques to address scalability challenges in large-scale systems such as LLNL’s Sierra. Satoshi Matsuoka, Director of RIKEN CCS, will discuss how machine learning will be accelerated with the current and future leading machines in Japan, such as the Post K computer. Maxim Naumov from Facebook and Michael Houston from NVIDIA will introduce industry perspectives into the discussion.

The follow-up panel session explores opportunities and challenges with future exascale systems as well as the largest current machines. To kickstart a lively discussion, we will prepare several questions for the panelists and the audience. Examples are: What are some of the most challenging barriers to exploiting large-scale machines in machine learning? What can machine learning enable for scientific and engineering problems with future exascale machines? What are some of the major roadblocks to integrating machine learning into traditional simulation-based workloads?

The BoF provides an additional forum for bringing together experts in the cross-cutting domain of ML and HPC at SC18, complementing existing programs such as the ML-HPC workshop. This BoF will further strengthen the machine learning aspects of the conference technical program, which have recently seen a rapid increase in interest among conference attendees.

As a final follow-up, we will publish a summary of BoF discussions as well as presentation slides.

While this is our first BoF proposal at SC, some of the session leaders and speakers also organized a minisymposium on deep learning at SIAM PP’18 in March 2018 (http://meetings.siam.org/sess/dsp_programsess.cfm?SESSIONCODE=63584).




