Authors: Meghan McClelland (Versity Software Inc), Gary Grider (Los Alamos National Laboratory), Matt Starr (Spectra Logic Corporation), Stefaan Vervaet (Western Digital Corporation), Jim Gerry (IBM), Bruce Gilpin (Versity Software Inc)
Abstract: Archives are changing. With exascale rapidly approaching and new workloads such as machine learning and IoT/instruments shifting the usage pattern toward more reads than writes, HPC organizations need archival storage systems that can support the rapidly increasing scale and changing workloads while meeting the capacity and financial requirements of the organization.
In this interactive BoF, an expert panel of leaders in the archiving industry will discuss and debate key archive requirements, how archives need to change to meet modern workflows, cost considerations for archives, and what long term data storage in future data centers will look like.
Long Description: This BOF is designed to bring together expert consumers and providers from the large scale data archiving field to contribute to a lively discussion with the community about long term data storage challenges for exascale. With performance, capacity, and cost constraints at odds with each other, the discussion of how to balance these elements is critical to data centers at scale.
Archives have a reputation for serving as static parking lots for unwanted data that people are afraid to delete. The panel has experience building and supporting systems where nothing could be further from the truth. Modern workloads have necessitated a shift in the use model to an active system that is accessed frequently and read much more often than it is written. Small files and high file counts continue to present major challenges for POSIX filesystems, and with the Internet of Things and advanced instruments collecting ever larger volumes of data, this problem is only getting worse. Managing metadata in massive namespaces has become a central issue for large-scale systems.
A panel of experts will discuss what they think the data center of the future looks like in terms of archiving, focusing on key archive requirements, media and tiering, and balancing cost/scale/performance/capacity.
Panelists will include:
Bruce Gilpin, Versity
Matt Starr, Spectra
Jim Gerry, IBM
Gary Grider, LANL/DOE
Henry Newman, Seagate Government Solutions
Meghan McClelland, Versity, will moderate.
The panelists are experts in the archiving/storage industry. Together they have hundreds of years of combined experience in storage, and each panelist has been carefully selected to represent a different perspective and opinion. The panel is sure to be lively and informative.
The goal of the BOF is to discuss the future of archiving for exascale systems and its role in the data center. The HPC audience in attendance will be very interested in archiving at scale because the memory and primary filesystems used when initially computing the data are too expensive to provide enough capacity to store everything that is needed. Participants are thinking about the future of storage for their systems and will benefit from hearing from experts, some with opposing points of view, which will help inform the options they explore.
Although this BOF is brand new for SC, the topics of data storage and exascale are certainly not. In fact, being able to burst data from one system to another has been a topic of much discussion over the past years. Because storage budgets are typically a very small portion of the overall HPC machine purchase, long term storage budgets are very constrained, yet they are expected to keep up with exascale workloads. The audience will benefit from the expert panel and have ample opportunity to ask questions and participate in the interactive discussion.