Authors:
Abstract: Triangle counting algorithm enumerates the triangles in a graph by identifying the common neighbors between two vertices of every edge. In this work, we present TriCore, a new GPU-based high-performance and scalable triangle counting system that consists of three main techniques. First, we design a binary search based counting algorithm that tremendously increases both thread parallelism and memory performance. Second, TriCore exploits a 2-D partition method to distribute the CSR representation across multiple GPUs, combined with a new streaming buffer to load the edge list from outside of GPUs. Third, we develop a dynamic workload management technique to balance the workload across multiple GPUs. Our evaluation demonstrates TriCore is 22× faster than the state-of-the-art parallel triangle counting projects. In addition, TriCore can not only process big graphs that are significant larger than the memory size of one GPU but also achieve 24× speedup when scaling to 32 GPUs.
Presentation: file
Back to Technical Papers Archive Listing