Bilkent University
Department of Computer Engineering


Style Synthesizing Conditional Generative Adversarial Networks


Yarkın Deniz Çetin
MS Student
(Supervisor: Assoc. Prof. Dr. Selim Aksoy)
(Co-Supervisor: Asst. Prof. Dr. R. Gökberk Cinbiş)

Computer Engineering Department
Bilkent University

Neural style transfer (NST) models make it possible to generate images that imitate an artistic style. Models that can transfer arbitrary styles without retraining or architectural changes are called universal style transfer (UST) models. For UST models, the artistic style is usually provided as an input style image. A style image with the required characteristics is therefore needed to carry out the transfer. However, for applications where only the broad style is predetermined and not its specifics, such an image may not exist or may be difficult to find.

In this work, we propose a style transfer network that can create stylized images without needing a style image at inference time; instead, it uses a conditioning label denoting the desired style. The conditioning label can contain multiple styles. Our model can generate diverse examples from the given style categories.

This requires the model to learn a one-to-many mapping from a single class label to multiple styles. For this reason, we base our model on generative adversarial networks (GANs), which have been shown to generate realistic data from highly complex, multi-modal distributions in a variety of domains. More specifically, we design a conditional GAN model that takes a conditioning vector and a noise vector as input and outputs the statistics necessary for style transfer.
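
The input/output interface of such a conditional generator can be illustrated with a minimal NumPy sketch. It is not the model from this work: the class name ToyStyleGenerator, the layer sizes, the tanh hidden layer, and the choice of a per-channel mean and variance as the output statistics are all assumptions made for illustration. The weights are random and untrained, and the discriminator is omitted. The sketch only shows how one multi-hot style label combined with different noise vectors yields different statistics (a one-to-many mapping).

```python
import numpy as np

class ToyStyleGenerator:
    """Hypothetical generator: (style label, noise) -> per-channel mean and variance."""

    def __init__(self, num_styles, noise_dim, num_channels, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        # Random, untrained weights; a real model would learn these adversarially.
        self.w1 = rng.normal(scale=0.1, size=(num_styles + noise_dim, hidden))
        self.w_mean = rng.normal(scale=0.1, size=(hidden, num_channels))
        self.w_logvar = rng.normal(scale=0.1, size=(hidden, num_channels))

    def __call__(self, label, noise):
        x = np.concatenate([label, noise])   # condition on label and noise jointly
        h = np.tanh(x @ self.w1)             # single hidden layer
        mean = h @ self.w_mean               # first-order statistics
        var = np.exp(h @ self.w_logvar)      # exponentiate so variances stay positive
        return mean, var

generator = ToyStyleGenerator(num_styles=3, noise_dim=8, num_channels=16)
label = np.array([1.0, 0.0, 1.0])            # multi-hot label: two styles combined
rng = np.random.default_rng(1)
mean_a, var_a = generator(label, rng.normal(size=8))
mean_b, var_b = generator(label, rng.normal(size=8))
```

Here mean_a and mean_b come from the same label but different noise samples, so they describe two different candidate styles within the same categories.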

Our approach builds upon a universal style transfer model. This model uses an auto-encoder scheme to extract the features of the content and style images and modifies the statistics of the content features to match those of the style image by using whitening and coloring transforms. We modify the whitening and coloring transforms so that they can generate images directly from the diagonal feature covariance matrix and the first-order statistics of the feature matrix. This approximation speeds up image generation.
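
The idea of coloring with diagonal statistics can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not the exact transform used in this work: the content features are assumed to be a (channels, positions) matrix, whitening uses the full content covariance via an eigendecomposition, and coloring uses only a per-channel style variance (the diagonal of the style covariance) and a per-channel style mean. The function names whiten and color_diagonal are hypothetical.

```python
import numpy as np

def whiten(features, eps=1e-5):
    """Center (C, N) features and decorrelate them so their covariance is close to identity."""
    mean = features.mean(axis=1, keepdims=True)
    centered = features - mean
    cov = centered @ centered.T / (features.shape[1] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)   # cov is symmetric, so eigh applies
    inv_sqrt = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return inv_sqrt @ centered

def color_diagonal(whitened, style_mean, style_var):
    """Color whitened features using only per-channel style variance and mean."""
    # With a diagonal covariance, the matrix square root reduces to a per-channel scale.
    return np.sqrt(style_var)[:, None] * whitened + style_mean[:, None]

rng = np.random.default_rng(0)
mixing = np.eye(4) + 0.3                      # correlates the toy content channels
content = mixing @ rng.normal(size=(4, 2000))
style_mean = np.array([0.5, -1.0, 2.0, 0.0])  # assumed style statistics
style_var = np.array([1.0, 0.25, 4.0, 2.0])
stylized = color_diagonal(whiten(content), style_mean, style_var)
```

Because the coloring step needs no eigendecomposition of a style covariance matrix, only a mean vector and a variance vector have to be supplied for the style, which is the kind of compact target a generator can output directly.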

We demonstrate that our approximation method achieves similar quality at higher speed. We use a subset of the WikiArt dataset to train and validate our models, and we form the baselines using the original model. Our model is controllable through style conditioning labels. This allows it to generate style combinations not included in the dataset and to synthesize novel styles. These combined styles are shown to correlate well with actual style images not seen by our model.


DATE: 08 January 2020, Wednesday @ 09:00