Bilkent University
Department of Computer Engineering


Performance improvement on latency-bound parallel HPC


Mustafa Duymuş
MS Student
(Supervisor: Prof. Dr. Cevdet Aykanat)
Computer Engineering Department
Bilkent University

The performance of parallelized High Performance Computing (HPC) applications is tied to the efficiency of the underlying processor-to-processor communication. In latency-bound applications, performance is bottlenecked by the processor that sends the maximum number of messages to the other processors. To reduce this latency overhead, we propose a two-phase message-sharing-based algorithm in which the bottleneck processor is paired with another processor. In the first phase, the bottleneck processor is paired with the processor with which it has the maximum number of common outgoing messages. In the second phase, the bottleneck processor is paired with the processor that has the minimum number of outgoing messages. In both phases, the paired processors share their common outgoing messages between them, which reduces their total number of outgoing messages and, in particular, the number of outgoing messages of the bottleneck processor. We use Sparse Matrix-Vector Multiplication as the kernel application and a 512-processor setting for the experiments. The proposed message-sharing algorithm achieves a reduction of 84% in the number of messages sent by the bottleneck processor and a reduction of 60% in the total number of messages in the system.
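
As a rough illustration of the pairing idea, the sketch below models each processor's outgoing messages as a set of destination ids and performs one pairing step. It is not the implementation from the thesis: the function names, the tie-breaking by lowest id, and the cost model (each partner forwards part of the common destinations on behalf of both, and the pair exchanges one message in each direction) are assumptions made for this example.

```python
def bottleneck(send):
    """Processor with the most outgoing messages (ties: lowest id)."""
    return max(sorted(send), key=lambda p: len(send[p]))


def phase1_partner(send, b):
    """Phase 1 choice: the processor sharing the most destinations with b."""
    others = [q for q in sorted(send) if q != b]
    return max(others, key=lambda q: len(send[b] & send[q]))


def phase2_partner(send, b):
    """Phase 2 choice: the processor with the fewest outgoing messages."""
    others = [q for q in sorted(send) if q != b]
    return min(others, key=lambda q: len(send[q]))


def share_common(send, p, q):
    """Split the destinations common to p and q between the two.

    Assumed cost model: p and q exchange one message in each direction,
    then each forwards the combined data to its part of the common
    destinations. Returns True if p's message count went down.
    """
    common = sorted((send[p] & send[q]) - {p, q})
    keep_p = set(common[: len(common) // 2])  # bottleneck keeps the smaller part
    keep_q = set(common) - keep_p
    new_p = (send[p] - keep_q) | {q}
    new_q = (send[q] - keep_p) | {p}
    if len(new_p) >= len(send[p]):
        return False  # sharing would not help the bottleneck; leave as is
    send[p], send[q] = new_p, new_q
    return True


# Toy instance: processor 0 sends 4 messages, processor 1 sends 3.
send = {0: {2, 3, 4, 5}, 1: {2, 3, 4}, 2: {0}, 3: set(), 4: set(), 5: set()}
b = bottleneck(send)
q = phase1_partner(send, b)
share_common(send, b, q)
print(b, q, sorted(send[b]), sorted(send[q]))  # 0 1 [1, 2, 5] [0, 3, 4]
```

In this toy instance the bottleneck's count drops from 4 to 3 and the system total from 8 to 7. A full algorithm would repeat such steps and recompute the bottleneck after each one.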


DATE: 02 February 2021, Monday @ 11:00