A Block-Oriented, Parallel, and Collective Approach to Sparse Indefinite Preconditioning on GPUs
Authors: Daniel Thuerck (Technical University Darmstadt)
Abstract: Large sparse symmetric indefinite matrices are notoriously hard to precondition. They often lack diagonal dominance and exhibit Schur complements that render zero fill-in factorization preconditioning ineffective. Pivoting, a necessity for stable LDL^T factorizations, complicates parallel approaches that can take advantage of the latest massively parallel HPC hardware such as GPUs. We present an approach based on ad-hoc blocking and reordering strategies that allows local, independent, collective-oriented processing of small dense blocks. A hybrid block-memory layout compensates for the irregular memory access patterns found in sparse matrices. Our method allows restricted fill-in, supernodal pivoting, and a dual-threshold dropping strategy at little additional cost. It delivers robust preconditioners that in our experiments obtain an average speedup of ~6x, even for tough matrices from optimization problems.
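The abstract does not spell out the dropping rule, so the following is only a minimal sketch of what a dual-threshold strategy typically looks like, assuming the classic ILUT-style formulation: first drop entries that are small relative to the row norm, then cap the number of surviving entries. The function name `dual_threshold_drop`, the dict-based row layout, and the parameters `tau` and `max_keep` are hypothetical and chosen for illustration; the paper's block-oriented GPU variant may differ.

```python
import math


def dual_threshold_drop(row, tau, max_keep):
    """Apply an ILUT-style dual-threshold rule to one sparse row.

    row      -- dict mapping column index -> value (hypothetical layout)
    tau      -- relative tolerance: entries with |v| < tau * ||row||_2 are dropped
    max_keep -- maximum number of entries kept after the tolerance test
    """
    norm = math.sqrt(sum(v * v for v in row.values()))
    # Threshold 1: drop entries that are small relative to the row norm.
    survivors = {j: v for j, v in row.items() if abs(v) >= tau * norm}
    # Threshold 2: of the survivors, keep only the max_keep largest in magnitude.
    largest = sorted(survivors.items(), key=lambda jv: abs(jv[1]), reverse=True)
    # Return the kept entries ordered by column index.
    return dict(sorted(largest[:max_keep]))


# Illustrative row of a partially factorized matrix (made-up values).
row = {0: 4.0, 3: -0.01, 5: 2.5, 7: 0.3, 9: -1.0}
kept = dual_threshold_drop(row, tau=0.05, max_keep=3)
print(kept)
```

In this example the entry in column 3 fails the relative tolerance test, and the entry in column 7 passes it but is removed by the cap on entries per row. The first threshold controls accuracy, while the second bounds memory and fill-in.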
Workshop: IA^3 2018: 8th Workshop on Irregular Applications: Architectures and Algorithms