Authors:
Abstract: With the ongoing improvement of computational power and memory capacity, the volume of scientific data keeps growing. To gain insights from vast amounts of data, scientists are starting to look at big data processing and analytics tools such as Apache Spark. In this poster, we explore Thrill, a framework for big data computation on HPC clusters that provides an interface similar to systems like Apache Spark but delivers higher performance since it is built on C++ and MPI. Using Thrill, we implemented several analytics operations to post-process and analyze data from plasma physics and molecular dynamics simulations. We implemented those operations with less programming effort than hand-crafted data processing programs would require, and we obtained preliminary results that were verified by scientists at LANL.
Best Poster Finalist (BP): no