SimFS: A Simulation Data Virtualizing File System Interface
Description
In the big (simulation) data era, simulations often produce petabytes of data to be stored in parallel filesystems or large-scale databases. This data is accessed, often by thousands of analysts and scientists, over the course of decades. However, storing these volumes of data for long periods of time is not cost-effective and, in some cases, practically impossible.

SimFS transparently virtualizes the simulation output, relaxing the storage requirements and re-simulating missing data on demand. SimFS monitors the analysis access pattern in order to (1) decide which data to store and (2) apply prefetching techniques that improve analysis performance. SimFS enables a trade-off between on-disk solutions, where all the simulation data is stored on disk, and in situ solutions, where no data is stored and analyses are always coupled with simulations. Overall, by exploiting the growing computing power and relaxing the storage capacity requirements, SimFS offers a viable path towards exascale simulations.
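
The idea of re-simulating missing data on demand can be illustrated with a minimal Python sketch. It is not the SimFS interface: the class VirtualizedOutput, the function toy_simulation, and the parameters restart_interval, capacity, and prefetch are all hypothetical names chosen for illustration. The sketch assumes that the simulation can be restarted at fixed intervals, keeps output steps in a bounded least-recently-used cache, and uses a fixed read-ahead in place of the access-pattern monitoring described above.

```python
from collections import OrderedDict


def toy_simulation(start, stop):
    """Hypothetical stand-in for a simulation: yields (step, data) pairs."""
    for step in range(start, stop + 1):
        yield step, step * step


class VirtualizedOutput:
    """Toy model: output steps live in a bounded cache and are recomputed on a miss."""

    def __init__(self, simulate, restart_interval, capacity, prefetch=0):
        self.simulate = simulate                  # callable (start, stop) -> iterable
        self.restart_interval = restart_interval  # assumed spacing of restart points
        self.capacity = capacity                  # max number of output steps kept
        self.prefetch = prefetch                  # fixed read-ahead, in steps
        self.cache = OrderedDict()                # step -> data, oldest first
        self.resimulated = 0                      # number of recomputed steps

    def _store(self, step, data):
        self.cache[step] = data
        self.cache.move_to_end(step)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)        # evict the least recently used step

    def read(self, step):
        if step in self.cache:                    # hit: data is still stored
            self.cache.move_to_end(step)
            return self.cache[step]
        # Miss: rerun from the nearest earlier restart point, plus a read-ahead.
        start = (step // self.restart_interval) * self.restart_interval
        result = None
        for s, data in self.simulate(start, step + self.prefetch):
            self._store(s, data)
            self.resimulated += 1
            if s == step:
                result = data                     # keep it even if later evicted
        return result


out = VirtualizedOutput(toy_simulation, restart_interval=5, capacity=4, prefetch=2)
first = out.read(7)    # miss: steps 5..9 are re-simulated
second = out.read(8)   # hit: step 8 was produced by the read-ahead
```

In this sketch, a larger capacity moves the behavior towards the on-disk end of the trade-off, while a capacity close to zero approaches the in situ end, where every read triggers a simulation.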
