Bilkent University
Department of Computer Engineering


Node-Wise Model Parallelism for Distributed Convolutional Neural Networks


Burak Eserol
MS Student
Computer Engineering Department
Bilkent University

Convolutional neural networks have become very popular in recent years owing to their success, and their accuracy tends to improve as networks become deeper. As networks deepen, however, scalability becomes a problem and training time becomes a bottleneck: training well-known deep convolutional neural networks on the ImageNet dataset takes weeks. Three parallelization techniques are commonly used to speed up training: data parallelism, in which the data is partitioned into shards that are processed by replicas of the model; model parallelism, in which the network is partitioned by layers and operates as a pipeline; and hybrid parallelism, in which both methods are used together. We developed a new method called node-wise model parallelism. Instead of partitioning a network by layers, this method partitions it by the nodes within layers. To achieve this, we convert a deep convolutional neural network into a hypergraph and partition it using PaToH. In this study, we compare node-wise model parallelism with the other parallelism methods in terms of work balance, communication volume, and convergence performance.
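
As a rough illustration of the hypergraph view, the sketch below models a toy one-dimensional convolution layer. It is not the method presented in the talk: the hypergraph model (one net per input node, connecting it to the output nodes that read it), the function names, and the contiguous block partition that stands in for PaToH are all assumptions made for this example. It reports the two quantities the abstract mentions for a partition: communication volume, measured here with the connectivity-minus-one metric, and work balance, measured as multiply-adds per part.

```python
from collections import defaultdict


def build_conv1d_hypergraph(n_in, kernel):
    """Build nets for a toy 1-D convolution (assumed model, stride 1, no padding).

    Each input node i defines one net whose pins are the input node itself
    and every output node whose window covers i.
    """
    n_out = n_in - kernel + 1
    nets = []
    for i in range(n_in):
        pins = [("in", i)]
        # Output j reads inputs j .. j + kernel - 1.
        for j in range(max(0, i - kernel + 1), min(n_out, i + 1)):
            pins.append(("out", j))
        nets.append(pins)
    return nets, n_out


def block_partition(n_in, n_out, parts):
    """Assign nodes to parts in contiguous blocks (a stand-in for PaToH)."""
    part_of = {}
    for i in range(n_in):
        part_of[("in", i)] = i * parts // n_in
    for j in range(n_out):
        part_of[("out", j)] = j * parts // n_out
    return part_of


def connectivity_minus_one(nets, part_of):
    """Sum over nets of (number of distinct parts touched - 1)."""
    return sum(len({part_of[v] for v in pins}) - 1 for pins in nets)


def part_loads(part_of, weight):
    """Total vertex weight assigned to each part."""
    loads = defaultdict(int)
    for v, p in part_of.items():
        loads[p] += weight(v)
    return dict(loads)


kernel = 3
nets, n_out = build_conv1d_hypergraph(n_in=8, kernel=kernel)
part_of = block_partition(n_in=8, n_out=n_out, parts=2)

# Communication volume: activations that must cross a part boundary.
volume = connectivity_minus_one(nets, part_of)
# Work balance: each output node costs `kernel` multiply-adds; inputs cost none.
loads = part_loads(part_of, lambda v: kernel if v[0] == "out" else 0)

print("communication volume:", volume)
print("load per part:", loads)
```

A real partitioner would search for an assignment that minimizes the communication volume subject to a balance constraint on the loads; the block partition here only makes the two metrics concrete.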


DATE: Monday, 05 March 2018; CS590 & CS690 presentations begin at 15:40