PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution
Time: Tuesday, November 13th, 4:30pm - 5pm
Description: Pattern matching is a powerful graph analysis tool. Unfortunately, existing solutions have limited scalability, support only a limited set of search patterns, and/or focus on only a subset of the real-world problems associated with pattern matching. This paper presents a new algorithmic pipeline that: (i) enables highly scalable pattern matching on labeled graphs, (ii) supports arbitrary patterns, (iii) enables trade-offs between precision and time-to-solution (while always selecting all vertices and edges that participate in matches, thus offering 100% recall), and (iv) supports a set of popular data analytics scenarios. We implement our approach on top of HavoqGT and demonstrate its advantages through strong and weak scaling experiments on massive-scale real-world (up to 257 billion edges) and synthetic (up to 4.4 trillion edges) graphs, respectively, and at scales (1,024 nodes / 36,864 cores) orders of magnitude larger than those used in the past for similar problems.
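To make the idea of pruning with 100% recall concrete, here is a minimal single-machine sketch in Python. It is not the paper's implementation and does not use HavoqGT; the function name `prune_by_local_constraints`, the adjacency-dictionary representation, and the toy graph are all assumptions made for illustration. The sketch repeatedly discards a graph vertex as a candidate for a pattern vertex when its label differs or when some required pattern neighbor has no candidate among its own neighbors. Any vertex that participates in a real match always survives this check, so recall is preserved, while some surviving vertices may still be false positives (precision below 100%).

```python
def prune_by_local_constraints(graph, labels, pattern, pattern_labels):
    """Toy constraint-based pruning on an undirected labeled graph.

    graph, pattern: dict mapping a vertex to the set of its neighbors.
    labels, pattern_labels: dict mapping a vertex to its label.
    Returns a dict mapping each surviving graph vertex to the set of
    pattern vertices it may still match.
    """
    # Start with label agreement as the only constraint.
    candidates = {
        v: {p for p in pattern if pattern_labels[p] == labels[v]}
        for v in graph
    }
    changed = True
    while changed:  # iterate until no candidate can be removed
        changed = False
        for v in graph:
            for p in list(candidates[v]):
                # v can play the role of p only if every pattern neighbor q
                # of p is still a candidate for some neighbor u of v.
                supported = all(
                    any(q in candidates[u] for u in graph[v])
                    for q in pattern[p]
                )
                if not supported:
                    candidates[v].discard(p)
                    changed = True
    return {v: c for v, c in candidates.items() if c}


# Hypothetical pattern: a triangle with labels a, b, c.
pattern = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
pattern_labels = {"A": "a", "B": "b", "C": "c"}

# Hypothetical background graph: triangle 1-2-3, plus vertex 4 (label a)
# attached only to 2, plus vertex 5 (label d) attached only to 3.
graph = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 5}, 4: {2}, 5: {3}}
labels = {1: "a", 2: "b", 3: "c", 4: "a", 5: "d"}

survivors = prune_by_local_constraints(graph, labels, pattern, pattern_labels)
print(survivors)
```

In this toy case vertex 5 is dropped because its label does not occur in the pattern, and vertex 4 is dropped because it has no neighbor that could match the pattern vertex labeled c. Local checks like this are cheap but can leave false positives on larger patterns, which is one way to picture the precision versus time-to-solution trade-off the description mentions.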