Synthetic Data Generation for Evaluating Parallel I/O Compression Performance and Scalability
Authors: Sean B. Ziegeler (US Department of Defense HPC Modernization Program, Engility Corporation)
Abstract: Compression is one of the most common forms of data reduction and is typically the least invasive. As compute capability continues to outpace I/O bandwidth, compression becomes increasingly attractive. This paper explores the scalable performance of parallel compression and presents an in-depth analysis of a coherent noise algorithm for generating synthetic data with which parallel compression can be easily evaluated. The algorithm favors simplicity, ease of use, and scalability over high-fidelity reconstruction of real data, so we go to some lengths to show that the generated synthetic data is a suitable proxy for evaluating compression, especially in benchmarks and mini-apps.
Workshop: The 4th International Workshop on Data Reduction for Big Scientific Data (DRBSD-4)