Feature-Relevant Data Reduction for In Situ Workflows
Authors: Will Fox (Massachusetts Institute of Technology)
Abstract: As the volume of data produced by HPC simulations continues to grow while I/O throughput fails to keep pace, in situ data reduction is becoming an increasingly necessary component of HPC workflows. Application scientists, however, prefer to avoid reduction in order to preserve data fidelity for post hoc analysis. To balance data quality against data volume, this work introduces the concept of feature-relevant compression. We examine two scientific datasets and quantify the impact of compression on features of interest by identifying those features and analyzing how their properties change after compression. We find that simulation data can be compressed in a lossy manner while preserving the desired feature properties to within a predetermined error tolerance. We also suggest that this error quantification could be applied within an in situ workflow to tune compression parameters dynamically during a simulation, compressing aggressively where features are simple and preserving structure where data complexity increases. Future work should focus on implementation, extension to additional compression algorithms, and analysis of these techniques on quantities derived from the original simulation data.
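The tuning idea in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: uniform quantization stands in for a real lossy compressor, the "feature property" is assumed to be the number of samples above a threshold, and all names (`quantize`, `feature_size`, `feature_error`, `choose_step`) are hypothetical. The sketch picks the coarsest quantization step whose feature error stays within a chosen tolerance.

```python
def quantize(values, step):
    """Stand-in lossy compressor (assumption): snap values to multiples of step."""
    return [round(v / step) * step for v in values]


def feature_size(values, threshold):
    """Hypothetical feature property: number of samples above a threshold."""
    return sum(1 for v in values if v > threshold)


def feature_error(original, reduced, threshold):
    """Relative change in feature size caused by the reduction."""
    ref = feature_size(original, threshold)
    new = feature_size(reduced, threshold)
    if ref == 0:
        return 0.0 if new == 0 else float("inf")
    return abs(new - ref) / ref


def choose_step(values, threshold, tolerance, candidate_steps):
    """Return the coarsest candidate step whose feature error is within
    tolerance, or None if no candidate qualifies."""
    for step in sorted(candidate_steps, reverse=True):
        reduced = quantize(values, step)
        if feature_error(values, reduced, threshold) <= tolerance:
            return step
    return None


# Illustrative data only; not taken from the datasets studied in the paper.
values = [0.1, 0.2, 0.9, 1.1, 1.6, 2.4]
step = choose_step(values, threshold=1.4, tolerance=0.1,
                   candidate_steps=[0.125, 0.5, 2.0])
print(step)
```

In an in situ setting, a check of this kind would presumably run per time step or per block, so that regions with simple features receive a coarse step and regions with complex features a finer one.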
Workshop: The 4th International Workshop on Data Reduction for Big Scientific Data (DRBSD-4)