Abstract: Graph applications have characteristics that are uncommon in other application domains. In this paper, we analyze multiple graph applications on current multi-core and many-core processors, offer new insights into executing them on many-core processors, and provide conclusions and recommendations for future designs.
Our main novel observations are (i) some memory streams show locality, while others show none, (ii) thread imbalance becomes a major problem at high thread counts, and (iii) many threads are required to saturate high-bandwidth memories. We recommend a selective memory access policy, in which accesses with locality are cached and prefetched, while accesses without locality can remain uncached to save cache capacity. Additionally, although more threads are needed, they are not used efficiently because of thread imbalance. We therefore recommend revising graph analysis algorithms to expose more parallelism, and providing a few high-performance cores that speed up sections with low parallelism.
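As a rough illustration of observation (ii), the following sketch shows how thread imbalance can grow with the thread count. It assumes a static, contiguous partitioning of vertices across threads and uses each thread's edge count as a proxy for its work; the function name `edge_imbalance` and the toy degree list are hypothetical and are not taken from the paper's experiments.

```python
def edge_imbalance(degrees, num_threads):
    """Ratio of the busiest thread's edge count to the mean edge count per thread.

    Assumes vertices are split into contiguous, equally sized chunks
    (one chunk per thread), which is only one possible scheduling policy.
    """
    total = sum(degrees)
    if total == 0:
        return 1.0  # no work at all, so treat it as perfectly balanced
    n = len(degrees)
    chunk = -(-n // num_threads)  # ceiling division: vertices per thread
    loads = [sum(degrees[i:i + chunk]) for i in range(0, n, chunk)]
    loads += [0] * (num_threads - len(loads))  # threads left without vertices
    return max(loads) / (total / num_threads)


# Toy skewed graph (hypothetical): one hub vertex plus 63 low-degree vertices.
degrees = [64] + [1] * 63
for threads in (2, 16, 64):
    print(threads, round(edge_imbalance(degrees, threads), 2))
```

In this toy case the ratio rises as threads are added, because the thread that owns the hub vertex keeps most of its work while the average load per thread shrinks.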