Bilkent University
Department of Computer Engineering


Fully Convolutional Networks with Spectral Pooling Methods


Onur Aydın
MS Student
Computer Engineering Department
Bilkent University

Due to recent advances in deep learning, the mainstream approaches in computer vision research have shifted dramatically from hand-crafted features combined with classifiers to end-to-end trained convolutional neural networks (CNN) and fully convolutional networks (FCN), which give state-of-the-art results in image classification and semantic segmentation, respectively. The max pooling layer in convolutional neural networks is used to spatially summarize the information contained within an image. Although it is widely used, the max pooling operation does not retain enough spatial information and restricts the choice of output dimensions. We propose an alternative pooling approach for fully convolutional networks. Spectral pooling performs downsampling in the frequency domain using the discrete Fourier transform; we further improve it using the discrete cosine transform, which brings simplicity, better energy compaction, and a good approximation to the Karhunen–Loève transform. Spectral pooling methods preserve considerably more spatial information than max pooling and provide flexibility in output dimensions. Using spectral pooling in fully convolutional networks is expected to enable more accurate and detailed segmentation results, even without skip connections that combine semantic information from a deep, coarse layer with appearance information from a shallow, fine layer. Furthermore, because of its flexibility in output dimensions, spectral pooling makes it possible to construct deeper fully convolutional network architectures, which leads to learning more levels of abstraction from images and obtaining more accurate segmentation results. Additionally, with a similar approach, we plan to investigate a deconvolution network that uses spectral unpooling to move from a low-resolution representation back to a high-resolution one.
As future work, we intend to parameterize convolutional networks entirely in the frequency domain using the discrete cosine transform and to improve the spectral pooling method with the discrete wavelet transform.
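
To make the idea of downsampling in the frequency domain concrete, the following is a minimal sketch of DFT-based spectral pooling for a single 2-D feature map. It is an illustration under stated assumptions, not the implementation presented in the talk: the function name spectral_pool is hypothetical, NumPy is assumed, and the handling of the spectrum for even output sizes is simplified by taking the real part of the result.

```python
import numpy as np

def spectral_pool(x, out_h, out_w):
    """Downsample a real 2-D array by cropping its centered DFT spectrum.

    Hypothetical helper for illustration only. The output size can be any
    (out_h, out_w) not larger than the input, which is the flexibility in
    output dimensions that max pooling lacks.
    """
    h, w = x.shape
    if not (0 < out_h <= h and 0 < out_w <= w):
        raise ValueError("output size must be positive and no larger than the input")

    # Move the zero-frequency (DC) term to index (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(x))

    # Keep the low-frequency block around DC; DC lands at (out_h // 2, out_w // 2).
    top = h // 2 - out_h // 2
    left = w // 2 - out_w // 2
    cropped = spectrum[top:top + out_h, left:left + out_w]

    # Undo the shift and return to the spatial domain.
    pooled = np.fft.ifft2(np.fft.ifftshift(cropped))

    # Simplification (assumption): discard the small imaginary residue that
    # can appear for even output sizes instead of enforcing conjugate symmetry.
    # The scale factor keeps the mean intensity of the input unchanged.
    return pooled.real * (out_h * out_w) / (h * w)


# Usage: pool an 8x8 map down to a non-integer-ratio size of 5x3.
feature_map = np.random.default_rng(0).standard_normal((8, 8))
print(spectral_pool(feature_map, 5, 3).shape)
```

A DCT-based variant, as mentioned in the abstract, would follow the same pattern of transforming, keeping the low-frequency coefficients, and inverting; it is not shown here.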


DATE: 24 October, 2016, Monday @ 17:00