parham1998/Neural_Style_Transfer

Neural_Style_Transfer

An implementation of the Neural Style Transfer algorithm in PyTorch.

Neural Style Transfer (NST) is one of the most fun techniques in deep learning. It generates a new image (G) by merging a content image (C) with a style image (S).
The goal of the algorithm is to find an image that has the same content as the content image (C) and the same style as the style image (S).

Picture1

Picture2

ConvNet

NST uses a pretrained convolutional network. (I've used VGG-19, which is the network used in the original paper and has already been trained on the large ImageNet dataset.)
Transfer learning suits this task because we want to extract meaningful features from images rather than train a model from scratch. The model parameters are kept fixed, and only the pixels of the generated image are updated to minimize the loss functions.
As seen below, the main idea is to extract features from multiple layers of VGG-19. (Style features are extracted from the yellow blocks, and content features are extracted from the blue block.)

Screenshot (428)
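The multi-layer feature extraction described above can be sketched as follows. This is a minimal illustration, not the repository's code: the small `nn.Sequential` network is a stand-in for VGG-19 so the example runs without downloading weights, and the helper name `extract_features` and the chosen layer indices are hypothetical.

```python
import torch
import torch.nn as nn

def extract_features(model, image, layer_indices):
    """Run `image` through `model` layer by layer and keep the selected activations."""
    features = {}
    x = image
    for idx, layer in enumerate(model):
        x = layer(x)
        if idx in layer_indices:
            features[idx] = x
    return features

torch.manual_seed(0)

# Stand-in for the convolutional part of a pretrained network (assumption:
# in practice this would be the VGG-19 feature layers with ImageNet weights).
net = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),  # index 0
    nn.ReLU(),                                  # index 1
    nn.MaxPool2d(2),                            # index 2
    nn.Conv2d(4, 8, kernel_size=3, padding=1),  # index 3
    nn.ReLU(),                                  # index 4
).eval()

# Freeze the network: its weights are only used to compute features.
for p in net.parameters():
    p.requires_grad_(False)

image = torch.rand(1, 3, 16, 16)  # placeholder image batch (N, C, H, W)
feats = extract_features(net, image, layer_indices={1, 4})

for idx, f in feats.items():
    print(idx, tuple(f.shape))
```

Iterating over the layers keeps the example explicit; forward hooks would be another reasonable way to collect intermediate activations.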

loss functions

There are two loss (cost) functions: content-loss and style-loss.

content-loss:

The purpose of this loss function is to ensure that the generated image G matches the content of the image C.
The earlier (shallower) layers of a ConvNet tend to detect lower-level features such as edges and simple textures, while the later (deeper) layers tend to detect higher-level features such as more complex textures and object classes. You can choose any of the VGG-19 convolution layers, but you'll usually get the most visually pleasing results from a layer in the middle of the network, neither too shallow nor too deep.
The image below shows the definition of the content-loss function:

Picture4
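As a rough sketch, the content-loss compares the activations of G and C at the chosen layer element by element. The function name `content_loss` is hypothetical, and the normalization used here (a plain mean over all entries) is an assumption; the constant in the figure above may differ, which only rescales the loss.

```python
import torch

def content_loss(gen_features, content_features):
    """Mean squared difference between generated-image and content-image activations."""
    return torch.mean((gen_features - content_features) ** 2)

# Toy activations with shape (N, C, H, W).
a_C = torch.zeros(1, 2, 2, 2)
a_G = torch.ones(1, 2, 2, 2)

loss_same = content_loss(a_C, a_C)  # identical features -> 0.0
loss_diff = content_loss(a_G, a_C)  # every entry differs by 1 -> 1.0
print(loss_same.item(), loss_diff.item())
```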

style-loss:

The purpose of this loss function is to ensure that the generated image G has the same style as the style image S.
I extract features from five layers to capture the style of the style image. Unlike the content-loss, the style-loss does not compare the style-image features with the generated-image features directly; the features are first converted into a style matrix (Gram matrix).
In linear algebra, the Gram matrix G of a set of vectors (V1, ..., Vn) is the matrix of their pairwise dot products, so G(ij) is the dot product of V(i) and V(j). In other words, G(ij) measures how similar V(i) is to V(j): if they are highly similar, their dot product is large, and so G(ij) is large. (Here G denotes the Gram matrix, not the generated image.)
Computing the Gram matrix, that is, the correlation between channels of a layer's features, is simple; you can see the definition in the image below:

Picture5
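A minimal sketch of the Gram matrix computation, assuming a feature map of shape (C, H, W): each channel is flattened into a vector and the matrix of all channel-by-channel dot products is taken. The function name `gram_matrix` is hypothetical, and no normalization is applied here.

```python
import torch

def gram_matrix(features):
    """Map a (C, H, W) feature map to its (C, C) matrix of channel dot products."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)  # one row per channel
    return flat @ flat.t()

# Channels 0 and 1 are identical; channel 2 is orthogonal to both.
feats = torch.tensor([
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
])

gram = gram_matrix(feats)
print(gram)
# tensor([[2., 2., 0.],
#         [2., 2., 0.],
#         [0., 0., 2.]])
```

The identical channels produce a large entry (2), while the orthogonal channel produces 0, which matches the similarity interpretation above.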

The formula of style-loss for just one layer and the formula of the total style-loss (sum of each layer's style-loss) can be seen below:

f1 Screenshot (438)
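The per-layer and total style-loss can be sketched as below. The normalization 1 / (4 * C^2 * (H*W)^2) follows the formulation in Gatys et al.; the constants in the figures above may differ. The function names and the per-layer weights are hypothetical.

```python
import torch

def gram_matrix(features):
    """Map a (C, H, W) feature map to its (C, C) matrix of channel dot products."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.t()

def layer_style_loss(gen_feat, style_feat):
    """Squared Gram-matrix difference for one layer, normalized as in Gatys et al."""
    c, h, w = gen_feat.shape
    diff = gram_matrix(gen_feat) - gram_matrix(style_feat)
    return torch.sum(diff ** 2) / (4 * c ** 2 * (h * w) ** 2)

def total_style_loss(gen_feats, style_feats, layer_weights):
    """Weighted sum of the per-layer style losses."""
    return sum(
        wl * layer_style_loss(g, s)
        for g, s, wl in zip(gen_feats, style_feats, layer_weights)
    )

# Two toy "layers", each with one channel and two spatial positions.
gen_feats = [torch.ones(1, 1, 2), torch.ones(1, 1, 2)]
style_feats = [torch.zeros(1, 1, 2), torch.zeros(1, 1, 2)]

per_layer = layer_style_loss(gen_feats[0], style_feats[0])        # 4 / 16 = 0.25
total = total_style_loss(gen_feats, style_feats, [0.5, 0.5])      # 0.25
same = total_style_loss(gen_feats, gen_feats, [0.5, 0.5])         # 0.0
print(per_layer.item(), total.item(), same.item())
```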

total-loss:

Finally, let's create a total loss function that combines the style and the content cost, so that minimizing it reduces both. The formula is:

f3
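In the formulation of Gatys et al., the total loss is a weighted sum of the two terms, J(G) = alpha * J_content(C, G) + beta * J_style(S, G). The sketch below shows how such a loss might be minimized with respect to the pixels of G while the network stays frozen. It is an illustration under several assumptions: a tiny random network stands in for VGG-19, a single layer is used for both content and style to keep it short, Adam is used as the optimizer, and the values of `alpha` and `beta` are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in feature extractor (assumption: the real setup uses pretrained VGG-19).
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, kernel_size=3, padding=1), nn.ReLU(),
).eval()
for p in net.parameters():
    p.requires_grad_(False)  # the model parameters stay fixed

# Copy kept only to check afterwards that the weights were not modified.
initial_weight = net[0].weight.clone()

def gram(feat):
    """Map (1, C, H, W) features to a (C, C) Gram matrix, scaled by the entry count."""
    _, c, h, w = feat.shape
    flat = feat.reshape(c, h * w)
    return flat @ flat.t() / (c * h * w)

content = torch.rand(1, 3, 16, 16)  # placeholder content image C
style = torch.rand(1, 3, 16, 16)    # placeholder style image S
generated = torch.rand(1, 3, 16, 16, requires_grad=True)  # pixels of G are optimized

content_target = net(content)
style_target = gram(net(style))

alpha, beta = 1.0, 10.0  # hypothetical content and style weights
optimizer = torch.optim.Adam([generated], lr=0.02)

history = []
for step in range(100):
    optimizer.zero_grad()
    feat = net(generated)
    j_content = F.mse_loss(feat, content_target)
    j_style = F.mse_loss(gram(feat), style_target)
    total = alpha * j_content + beta * j_style
    total.backward()   # gradients flow to the pixels, not to the network weights
    optimizer.step()
    history.append(total.item())

print(f"first loss: {history[0]:.4f}, last loss: {history[-1]:.4f}")
```

The ratio of `alpha` to `beta` controls the trade-off: a larger `beta` pushes the result toward the style image, a larger `alpha` toward the content image.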

references

L. A. Gatys, A. S. Ecker, and M. Bethge. "Image Style Transfer Using Convolutional Neural Networks." CVPR, 2016.