Exploiting HPC Technologies for Accelerating Big Data Processing and Associated Deep Learning
Time: Monday, November 12th, 1:30pm - 5pm
Description: The convergence of HPC, Big Data, and Deep Learning is the next game-changing business opportunity. Apache Hadoop, Spark, gRPC/TensorFlow, and Memcached are becoming standard building blocks for Big Data processing. Recent studies have shown that the default designs of these components cannot efficiently leverage the features of modern HPC clusters, such as RDMA-enabled high-performance interconnects, high-throughput parallel storage systems (e.g., Lustre), Non-Volatile Memory (NVM), and NVMe/NVMe-over-Fabric. This tutorial will provide an in-depth overview of the architecture of Hadoop, Spark, gRPC/TensorFlow, and Memcached. We will examine the challenges in re-designing the networking and I/O components of these middleware systems for modern interconnects and storage architectures. Using the publicly available software packages from the High-Performance Big Data project (HiBD, http://hibd.cse.ohio-state.edu), we will provide case studies of the new designs for several Hadoop/Spark/gRPC/TensorFlow/Memcached components and their associated benefits. Through these, we will also examine the interplay between high-performance interconnects, storage, and multi-core platforms needed to achieve the best solutions for these components and applications on modern HPC clusters. We will also present in-depth case studies that pair modern Deep Learning tools (e.g., Caffe, TensorFlow, CNTK, BigDL) with RDMA-enabled Hadoop, Spark, and gRPC. Finally, hands-on exercises will be carried out with the RDMA-Hadoop and RDMA-Spark software stacks on a cutting-edge HPC cluster.