SC18 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

HDF5: I/O Middleware and Ecosystem for HPC and Experimental and Observational Sciences


Authors: Elena Pourmal (HDF Group, Lawrence Berkeley National Laboratory), Quincey Koziol (Lawrence Berkeley National Laboratory)

Abstract: We will provide a forum for the HDF5 user community to share ideas and discuss initiatives in the areas of HPC and Experimental and Observational Sciences. Elena Pourmal will present HDF5 features under development and the HDF5 roadmap, including upcoming releases, and will solicit input on the future roadmap. Quincey Koziol will moderate a panel with representatives from research, commercial, and government organizations, who will present case studies on how they leverage HDF technologies in the fields of Experimental and Observational Sciences to solve big data problems and will discuss the challenges of using HDF5 in the HPC environment.

Long Description: HDF5 is a unique, open-source, high-performance technology suite that consists of an abstract data model, library, and file format used for storing and managing extremely large and/or complex data collections. The technology is used worldwide by government, industry, and academia in a wide range of science, engineering, and business disciplines.

More than 1,000 projects on GitHub use HDF5 because of its (1) versatile self-describing data model that can represent very complex data objects, relationships between the objects, and objects’ metadata; (2) completely portable binary file format with no limit on the number or size of data objects; (3) software library optimized for efficient I/O; and (4) tools for managing, manipulating, viewing, and analyzing HDF5 data. Every major HPC system vendor includes the HDF5 suite in its core software because of its broad adoption in science applications and its ability to improve I/O performance and data organization within HPC environments. In addition, for more than two decades the HDF Group has worked with researchers around the globe to help capture, store, and analyze experimental data in HDF5, for example data collected at light sources and particle accelerators. In the past several years, the amount of experimental and observational data stored in HDF5 and the rate at which this data is collected have created new challenges for scientists and triggered requests for new HDF5 features, which are now under development.

The HDF Group team, in collaboration with LBNL, is excited to present to SC18 attendees the new HDF5 features under development and the latest HDF5 technology roadmap, and to share how scientific, government, and industry users apply HDF5 technologies to solve real-world problems, including work performed to support the DOE Exascale Computing Project (ECP) and Experimental and Observational Data (EOD) centers. The HDF Group encourages participants to discuss the challenges they face when using HDF5, which will help the HDF Group prioritize items on the HDF5 roadmap. The HDF Group will also encourage discussion on how the community can contribute to the maintenance and future development of HDF5.




