Using Thrill to Process Scientific Data on HPC
Time: Thursday, November 15th, 8:30am - 5pm
Description: With ongoing improvements in computational power and memory capacity, the volume of scientific data keeps growing. To gain insights from vast amounts of data, scientists are starting to look at big data processing and analytics tools such as Apache Spark. In this poster, we explore Thrill, a framework for big data computation on HPC clusters that provides an interface similar to systems like Apache Spark but delivers higher performance since it is built on C++ and MPI. Using Thrill, we implemented several analytics operations to post-process and analyze data from plasma physics and molecular dynamics simulations. These operations required less programming effort than hand-crafted data processing programs would, and they produced preliminary results that were verified by scientists at LANL.