Torun T., Torun F. Ş., Manguoglu M., Aykanat C.

SIAM JOURNAL ON SCIENTIFIC COMPUTING, vol.44, no.2, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 44 Issue: 2
  • Publication Date: 2022
  • Doi Number: 10.1137/21M1411603
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, ABI/INFORM, Agricultural & Environmental Science Database, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, MathSciNet, zbMATH, Civil Engineering Abstracts
  • Keywords: parallel Gauss-Seidel, distributed-memory, Spike algorithm, parallel sparse triangular solve, hypergraph partitioning, sparse matrix reordering, ALGEBRAIC MULTIGRID SOLVER, PRECONDITIONER, DECOMPOSITION, CONVERGENCE, SYSTEMS, MODELS
  • Middle East Technical University Affiliated: Yes


Gauss-Seidel (GS) is a widely used iterative method for solving sparse linear systems of equations and is also known to be effective as a smoother in algebraic multigrid methods. Parallelization of GS is a challenging task since solving the sparse lower triangular system in GS constitutes a sequential bottleneck at each iteration. We propose a distributed-memory parallel GS (dmpGS) by implementing a parallel sparse triangular solver (stSpike) based on the Spike algorithm. stSpike decouples the global triangular system into smaller systems that can be solved concurrently and requires the solution of a much smaller reduced sparse lower triangular system, which constitutes a sequential bottleneck. In order to alleviate this bottleneck and to reduce the communication overhead of dmpGS, we propose a partitioning and reordering model consisting of two phases. The first phase is a novel hypergraph partitioning model whose partitioning objective simultaneously encodes minimizing the reduced system size and the communication volume. The second phase is an in-block row reordering method for decreasing the nonzero count of the reduced system. Extensive experiments on a dataset consisting of 359 sparse linear systems verify the effectiveness of the proposed partitioning and reordering model in terms of reducing the communication and the sequential computational overheads. Parallel experiments on 12 large systems using up to 320 cores demonstrate that the proposed model significantly improves the scalability of dmpGS.
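To make the decoupling idea concrete, the sketch below illustrates a Spike-style lower-triangular solve on a deliberately simplified problem; it is not the paper's stSpike. It assumes a block lower-bidiagonal system in which each block is coupled only to the last unknown of the previous block, so the reduced system shrinks to one unknown per interface. `forward_solve` is the plain forward substitution whose sequential nature the abstract identifies as the bottleneck; `spike_like_solve` replaces the one global forward sweep with independent (parallelizable) local solves plus a tiny sequential reduced recurrence.

```python
def forward_solve(L, b):
    """Plain forward substitution on a dense lower-triangular matrix:
    the inherently sequential sweep that limits parallel GS."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x

def spike_like_solve(blocks, couplings, bs):
    """Toy Spike-style solve under a simplifying assumption: block i (i >= 1)
    is coupled only to the LAST unknown of block i-1, via the column vector
    couplings[i].  Steps:
      (1) independent local solves g_i = L_i^{-1} b_i and spike columns
          w_i = L_i^{-1} c_i  (these can all run concurrently),
      (2) a small sequential reduced lower-triangular system over the
          interface unknowns t_i (the last unknown of each block),
      (3) independent per-block back-updates x_i = g_i - w_i * t_{i-1}."""
    p = len(blocks)
    # Step 1 (parallelizable across blocks).
    g = [forward_solve(blocks[i], bs[i]) for i in range(p)]
    w = [None] + [forward_solve(blocks[i], couplings[i]) for i in range(1, p)]
    # Step 2 (sequential, but of size p rather than n): the reduced system.
    t = [g[0][-1]]
    for i in range(1, p):
        t.append(g[i][-1] - w[i][-1] * t[i - 1])
    # Step 3 (parallelizable across blocks).
    out = [list(g[0])]
    for i in range(1, p):
        out.append([g[i][k] - w[i][k] * t[i - 1] for k in range(len(g[i]))])
    return [v for blk in out for v in blk]

# Hypothetical example: 3 lower-triangular blocks of size 2, each coupled
# to the previous block's last unknown only.
blocks = [[[2.0, 0.0], [1.0, 3.0]],
          [[4.0, 0.0], [-1.0, 2.0]],
          [[3.0, 0.0], [0.5, 5.0]]]
couplings = [None, [1.0, 0.5], [-2.0, 1.0]]
bs = [[2.0, 4.0], [4.0, 1.0], [3.0, 5.0]]
x = spike_like_solve(blocks, couplings, bs)
```

In the paper's general sparse setting the interface set is larger and the reduced system is itself sparse and lower triangular, which is exactly what the proposed hypergraph partitioning and in-block row reordering aim to keep small.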