Presentation
Optimizing Python Data Processing for the DESI Experiment on the NERSC Cori Supercomputer
Author/Presenters
Event Type
Workshop
Diversity
Education
Hot Topics
Time
Sunday, November 11th, 2:12pm - 2:15pm
Location
D220
Description
The goal of the Dark Energy Spectroscopic Instrument (DESI) experiment is to better understand dark energy by making the most detailed 3D map of the universe to date. The images obtained each night over a period of 5 years starting in 2019 will be sent to the NERSC Cori supercomputer for processing and scientific analysis.
The DESI spectroscopic pipeline for processing these data is written exclusively in Python. Writing in Python allows the DESI scientists to write very readable scientific code in a relatively short amount of time. However, the drawback is that Python can be substantially slower than more traditional HPC languages like C, C++, and Fortran.
The goal of this work is to increase the efficiency of the DESI spectroscopic data processing at NERSC while satisfying the DESI requirement that the software remain in Python. As of this writing, we have obtained speedups of over 6x and 7x on the Cori Haswell and KNL partitions, respectively. We used several profiling tools to identify potential areas for improvement, including Python's cProfile, line_profiler, Intel VTune, and TAU. Once we identified expensive kernels, we applied the following techniques: 1) JIT-compiling hotspots using Numba (the most successful strategy so far), 2) reducing MPI data transfer where possible (for example, replacing broadcast operations with scatter), and 3) restructuring the code to compute and store important data rather than repeatedly calling expensive functions. We will continue using these strategies and will also explore the requirements for future architectures (for example, transitioning the DESI workload to GPUs).
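As a rough illustration of the profiling step and of technique 3 (compute and store rather than recompute), the following minimal sketch uses only the Python standard library. It is not taken from the DESI pipeline: the names expensive_kernel, cached_kernel, process, and profile_calls are hypothetical, and functools.lru_cache stands in for whatever storage scheme the actual code uses.

```python
import cProfile
import functools
import io
import pstats


def expensive_kernel(n):
    """Hypothetical stand-in for a costly function that is called repeatedly."""
    return sum(i * i for i in range(n))


@functools.lru_cache(maxsize=None)
def cached_kernel(n):
    """Same computation, but each distinct input is computed once and stored."""
    return expensive_kernel(n)


def process(kernel, sizes):
    """Call the kernel once per input, mimicking a loop with repeated work."""
    return [kernel(n) for n in sizes]


def profile_calls(func, *args):
    """Run func under cProfile; return its result and the report as text."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats()
    return result, report.getvalue()


# Assumed workload: five calls, but only two distinct inputs.
sizes = [20000, 20000, 20000, 40000, 40000]

slow_result, slow_report = profile_calls(process, expensive_kernel, sizes)
fast_result, fast_report = profile_calls(process, cached_kernel, sizes)

# Both versions produce identical output; the cached one does less work.
print(slow_result == fast_result)
print(cached_kernel.cache_info())
```

Comparing slow_report with fast_report shows where the time goes in each version. Storing results this way only helps when the same inputs recur and the function has no side effects, so it is something to apply after profiling has confirmed the hotspot.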