Synthetic Data Generation for Evaluating Parallel I/O Compression Performance and Scalability

SC18 Proceedings

The 4th International Workshop on Data Reduction for Big Scientific Data (DRBSD-4)

Authors: Sean B. Ziegeler (US Department of Defense HPC Modernization Program, Engility Corporation)

Abstract: Compression is one of the most common forms of data reduction and is typically the least invasive. As compute capability continues to outpace I/O bandwidth, compression becomes increasingly attractive. This paper explores the scalable performance of parallel compression and presents an in-depth analysis of a coherent noise algorithm that generates synthetic data for evaluating parallel compression. The algorithm favors simplicity, ease of use, and scalability over high-fidelity reconstruction of real data, so we go to some lengths to show that the generated synthetic data is a suitable proxy for evaluating compression, especially in benchmarks and mini-apps.
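
To make the idea of coherent noise as a compression proxy concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify which coherent noise function is used, so this example substitutes simple 1D value noise (random lattice values with smoothstep interpolation). The function names, the 8-bit quantization, and the use of zlib are all assumptions made for illustration. The sketch only shows that smoothly varying noise compresses better than uncorrelated white noise, and that the hypothetical `cell_size` parameter gives a rough knob on compressibility.

```python
import random
import zlib


def value_noise_1d(n_samples, cell_size, seed=0):
    """Return n_samples floats in [0, 1] of 1D value noise.

    Value noise is a stand-in coherent noise, chosen for brevity; it is
    an assumption here, not the algorithm analyzed in the paper.
    """
    rng = random.Random(seed)
    # One random value per lattice point, plus one extra for the last cell.
    n_lattice = n_samples // cell_size + 2
    lattice = [rng.random() for _ in range(n_lattice)]
    samples = []
    for i in range(n_samples):
        cell, offset = divmod(i, cell_size)
        t = offset / cell_size
        t = t * t * (3.0 - 2.0 * t)  # smoothstep easing between lattice points
        samples.append(lattice[cell] * (1.0 - t) + lattice[cell + 1] * t)
    return samples


def white_noise(n_samples, seed=0):
    """Return n_samples uncorrelated uniform floats for comparison."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_samples)]


def compression_ratio(samples):
    """Quantize samples to 8 bits, zlib-compress, return raw/compressed size."""
    raw = bytes(int(s * 255) for s in samples)
    return len(raw) / len(zlib.compress(raw, 6))


if __name__ == "__main__":
    n = 65536
    # Larger cells give smoother data, which should compress better.
    for cell_size in (64, 256, 1024):
        ratio = compression_ratio(value_noise_1d(n, cell_size, seed=1))
        print(f"coherent noise, cell_size={cell_size}: ratio {ratio:.2f}")
    print(f"white noise: ratio {compression_ratio(white_noise(n, seed=1)):.2f}")
```

Because each sample depends only on its index and the seed, a generator of this kind can in principle be evaluated independently on each process of a parallel job, which is one reason a noise function is a convenient choice for benchmarks and mini-apps.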
