Bilkent University
Department of Computer Engineering
S E M I N A R

 

Hierarchical Parallel 2D-Checkerboard Partitioning for Parallel Hybrid SpMxV

 

Gündüz Vehbi Demirci
MSc Student
Computer Engineering Department
Bilkent University

Repeated matrix-vector multiplications (SpMxV) y = Ax that involve the same large, sparse, structurally symmetric or nonsymmetric square or rectangular matrix are kernel operations in various iterative solvers. Efficient parallelization of these solvers requires matrix A to be partitioned among the processors in such a way that communication overhead is kept low while maintaining computational load balance. Effective matrix partitioning models based on graph/hypergraph partitioning and efficient multi-level graph/hypergraph partitioning tools to partition these models are proposed in the literature. Among these models, two-dimensional checkerboard (2D-CHB) partitioning is promising due to its’ following three desirable properties: i) 2D-CHB partitioning, as in all 2D partitioning schemes, performs nonzero partitioning, thus it has a higher flexibility in communication volume reduction when compared with the 1D partitioning schemes, ii) it is possible to perform 2D-CHB via two consequent 1D partitionings, thus 2D-CHB has a low partitioning overhead when compared with the other 2D partitioning schemes, iii) in a P processor system, if the processors are considered as a 2D mesh, the number of messages handled by a processor after 2D-CHB partitioning is bounded above by 2×(?P - 1). Unfortunately there are no parallel tools for automatic partitioning of 2D-CHB partitioning models.

State-of-the-art petascale computing systems are built from tens of thousands of processors in a hierarchical manner (e.g. Cores, Processors, Nodes, Boards, Racks) with varying interconnection types at each level of the hierarchy. When the level of parallelism reaches to petascale, the communication-computation ratio increases significantly, reducing the possibility of masking communication costs during computation periods. Furthermore, the significant interconnection difference within the different architectural levels of supercomputers suggests handling of the data partitioning problem differently for each architectural level of petascale computing systems. The natural solution to this problem is hierarchical partitioning.

In this study we describe the development process of a parallel hierarchical 2D-CHB hypergraph partitioner. Our partitioner makes use of ParMetis partitioning tool for graphs for parallel partitioning. We first describe a hierarchical parallel 1D partitioning scheme and then show the application of this scheme to produce a parallel 2D-Checkerboard Hypergraph Partitioner.

 

DATE: 25 March, 2013, Monday @ 16:10
PLACE: EA409