Presentation

Workshop: Optimizing Python Data Processing for the DESI Experiment on the NERSC Cori Supercomputer

Session: Women in HPC: Diversifying the HPC Community

Author/Presenters

Laurie A. Stephey

Rollin C. Thomas

Stephen J. Bailey

Event Type

Workshop

Time: Sunday, November 11th, 2:12pm - 2:15pm

Location: D220

Description: The goal of the Dark Energy Spectroscopic Instrument (DESI) experiment is to better understand dark energy by making the most detailed 3D map of the universe to date. The images obtained each night over a period of 5 years starting in 2019 will be sent to the NERSC Cori supercomputer for processing and scientific analysis.

The DESI spectroscopic pipeline for processing these data is written exclusively in Python. Python lets the DESI scientists write very readable scientific code in a relatively short amount of time. The drawback is that Python can be substantially slower than more traditional HPC languages like C, C++, and Fortran.

The goal of this work is to increase the efficiency of the DESI spectroscopic data processing at NERSC while satisfying the DESI team's requirement that the software remain in Python. As of this writing we have obtained speedups of over 6x and 7x on the Cori Haswell and KNL partitions, respectively. We used several profiling tools to identify potential areas for improvement, including Python's cProfile, line_profiler, Intel VTune, and TAU. Once we identified expensive kernels, we applied the following techniques: 1) JIT-compiling hotspots using Numba (the most successful strategy so far), 2) reducing MPI data transfer where possible (e.g., replacing broadcast operations with scatter), and 3) restructuring the code to compute and store important data once rather than repeatedly calling expensive functions. We will continue using these strategies and will also explore the requirements for future architectures (for example, transitioning the DESI workload to GPUs).
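As a rough illustration of the profiling step and of strategy 3 (compute and store rather than recompute), the following minimal sketch uses only the Python standard library. It is not DESI pipeline code: `expensive_weight`, `process_repeated`, `process_precomputed`, and `profile_calls` are hypothetical names, and the arithmetic is a stand-in for a real kernel.

```python
import cProfile
import io
import math
import pstats


def expensive_weight(wavelength):
    """Hypothetical stand-in for a costly kernel (not DESI code)."""
    return sum(math.sin(wavelength * k) ** 2 for k in range(1, 200))


def process_repeated(wavelengths, n_spectra):
    """Baseline: recompute every weight for every spectrum."""
    total = 0.0
    for _ in range(n_spectra):
        for w in wavelengths:
            total += expensive_weight(w)
    return total


def process_precomputed(wavelengths, n_spectra):
    """Restructured: compute each weight once, then reuse the stored values."""
    weights = [expensive_weight(w) for w in wavelengths]
    total = 0.0
    for _ in range(n_spectra):
        for wt in weights:
            total += wt
    return total


def profile_calls(func, *args):
    """Run func under cProfile; return (result, text report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


if __name__ == "__main__":
    # Assumed toy wavelength grid, chosen only for illustration.
    wavelengths = [3600.0 + 0.5 * i for i in range(50)]
    slow, slow_report = profile_calls(process_repeated, wavelengths, 20)
    fast, fast_report = profile_calls(process_precomputed, wavelengths, 20)
    print(slow_report)   # expensive_weight dominates the cumulative time
    print(fast_report)   # same result with far fewer expensive_weight calls
    print(math.isclose(slow, fast))
```

In the baseline, the report shows `expensive_weight` being called once per wavelength per spectrum. The restructured version calls it once per wavelength. The same profile-then-restructure loop would apply before reaching for Numba or changing MPI communication patterns, neither of which this sketch covers.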

Program November 11–16, 2018

Exhibits November 12–15, 2018

KAY BAILEY HUTCHISON CONVENTION CENTER DALLAS

The International Conference for High Performance
Computing, Networking, Storage, and Analysis
