SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Deep500: An HPC Deep Learning Benchmark and Competition

Authors: Tal Ben-Nun (ETH Zurich), Torsten Hoefler (ETH Zurich)

Abstract: This BoF discusses the creation of a new deep learning HPC benchmark and competition focused on scientific computing applications. The panel members are experts on various aspects of the field, drawn from leading universities around the world and top hardware manufacturers. The goal of the BoF is to standardize a new benchmark, which requires collaborative brainstorming by the entire HPC community.

Long Description: Machine Learning (ML), and particularly Deep Learning, will soon become necessary tools in the scientific computing toolbox. The recent success of ML has sparked research into scaling up the underlying problems and mechanisms in use. We believe that it is time, as an HPC community, to acknowledge deep learning as one of the cornerstone benchmarks for large-scale computing. However, properly designing such a benchmark brings forth several questions that should be addressed:

* What are the problems that we should test? In particular, is it possible to maintain a database of relevant models and training algorithms, which are in use by the scientific computing community?

* How can we adequately represent all the layers that constitute deep neural network training? Such layers include the hardware, software stacks, distributed communication mechanisms, etc.

* Given the multitude of datasets, performance metrics, and accuracy trade-offs (e.g., mixed-precision solutions), how should the competitors be ranked? How should the unit of measurement ("AI Operations") be defined?

* As a scientific computing problem, how does the reproducibility of the results come into play?

The BoF will be formatted as a panel on the above topics, followed by an open interactive discussion with the audience. The members of the panel will be:

* Dan Alistarh (IST Austria), a pioneer in sparse communication and quantization for ML, will discuss the communication aspects of distributed deep learning.

* Michaela Blott (Xilinx) is a researcher with unique expertise on hardware design and FPGAs for ML. She will contribute her insights on developing a hardware benchmark for deep learning.

* Pradeep Dubey (Intel) will share his experience of scaling deep learning to 15 PetaFLOPS and hardware/software stack support for deep learning on Intel architectures.

* Todd Gamblin (LLNL) will discuss scalable tools and algorithms for measurement, analysis and visualization of large-scale deep learning on supercomputers.

* Tom Gibbs (NVIDIA) will discuss applications for deep learning on large-scale data such as gravitational wave discovery.

* Torsten Hoefler (ETH Zurich) is a leading researcher in HPC and performance modeling, an active member of Graph500 and the chair of Green Graph500. He will discuss benchmark construction, scaling distributed deep learning, and performance models.

* Thorsten Kurth (LBL) will talk about supervised and semi-supervised classification for scientific data, and the use of DNNs for Physics analysis.

* Satoshi Matsuoka (Director of RIKEN R-CCS, Tokyo Tech) will discuss performance metrics for deep learning, as well as the recent advancements of his groups, including efficient use of recent hardware.

* Jidong Zhai (Tsinghua University) will share his group's experience in creating a deep learning suite, including applications and workload characterization analysis.

We expect the audience to include members from the entire spectrum of intersections between HPC and ML, including hardware designers, system engineers, compiler/runtime developers, and scientific computing researchers with ML applications. This BoF has not been held before, and its expected outcome is the establishment of a new benchmark and competition for HPC.
