
Inspecting Neural Style Transfer and Playaround 🎠

In this repository I have implemented the original Neural Style Transfer paper, "Image Style Transfer Using Convolutional Neural Networks", and inspected how the result of transferring the style image onto the content image changes with different weight constants, learning rates, optimizers, etc.

Contents

  1. Introduction
  2. Reconstruct
    1. Noise
    2. Content
    3. Style
    4. Further-Studies
  3. Visualization
    1. Style
    2. Content
    3. Both

Introduction

Style Transfer is the task of composing the style of one image (the style image) over another image (the content image). Before neural networks were applied to this task, the major limiting factor was obtaining feature representations of the content and style images that are good enough for composition. The lack of such representations stood in the way of understanding the semantics of the two images and separating them. With the success ✔️ of VGG networks in the ImageNet Challenge for Object Localization and Object Detection 🔍, researchers gave style transfer a neural approach.

The authors used feature representations from a VGG network to capture high- and low-level features of both the content and style images. Using this implicit information, they iteratively minimize the loss between the content representation and the generated image's representation (MSE loss on the features) and between the style representation and the generated image's representation (MSE loss on their Gram matrices). Unlike supervised learning, Neural Style Transfer has no metric for comparing the quality of the generated image(s). We are not training a model; instead, on every iteration we update the values of the image itself with gradient descent so that it closely matches the content and style images.
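
As a rough sketch of the two loss terms described above (this is not the repository's exact code; names and shapes are illustrative):

import torch
import torch.nn.functional as F

def gram_matrix(features):
    # features: (batch, channels, height, width) activations from one VGG layer
    b, c, h, w = features.size()
    flat = features.view(b, c, h * w)
    # channel-to-channel correlations, normalized by the number of elements
    return flat.bmm(flat.transpose(1, 2)) / (c * h * w)

def content_loss(gen_feats, content_feats):
    # MSE between canvas and content-image activations at one layer
    return F.mse_loss(gen_feats, content_feats)

def style_loss(gen_feats, style_feats):
    # MSE between the Gram matrices of canvas and style-image activations
    return F.mse_loss(gram_matrix(gen_feats), gram_matrix(style_feats))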

I believe this brief overview of Neural Style Transfer is enough to get us started with experiments and notice some fascinating results.

Note: This is not a blog post on Neural Style Transfer. No explanation of the type of model, training, etc. is provided.

Setting Parameters

For our experiments we will set the parameters to the following values unless explicitly stated otherwise.

iterations: 2500
fps: 30
size: 128
sav_freq: 10
alpha: 5.0
beta: 7000.0
gamma: 1.2
style_weights: [1e3/n**2 for n in [16.0,32.0,128.0,256.0,512.0]]
lr: 0.06

If paths to the content and style images are not provided, the default images inside NeuralStyleTransfer-App/src/data will be used.

For a detailed description of these parameters, run python3 main.py -h
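
Under these defaults, alpha and beta presumably weight the content and style terms, and style_weights scales each style layer's contribution. A hedged sketch of how the total objective is assumed to be assembled, reusing the loss helpers from the Introduction (the per-layer feature lists are illustrative; gamma is left out here, its role is best checked via python3 main.py -h):

# assumed combination of the two loss terms; everything except alpha,
# beta and style_weights is an illustrative placeholder
total_loss = alpha * content_loss(gen_content_feats, target_content_feats)
total_loss = total_loss + beta * sum(
    w * style_loss(g, s)
    for w, g, s in zip(style_weights, gen_style_feats, target_style_feats)
)
total_loss.backward()  # gradients flow into the canvas, not into VGG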

Reconstruct

Neural Style Transfer is like painting an image over a canvas. This canvas has the same size as the content image, since the content is static and the only dynamic change that needs to be composed over the canvas comes from the style image. Although the size is fixed to that of the content image, there are 3-4 ways we can initialize this canvas, after which gradient descent 📉 updates the values of the canvas.

The following shell command generates a canvas by blending the style over the content image. This is the basic bash command for reconstructing the canvas; for more information about the arguments, run python3 main.py --help

python3 main.py --reconstruct --content_layers <num> --style_layers 0 1 2 3 4

Noise

We can initialize the canvas with noise and then update its values so it looks like the content image with the style composed on it. The snippet below generates a noise canvas and sets requires_grad = True, which lets autograd compute gradients with respect to the canvas so its values can be updated.

generated_image = torch.randn(content_image.size())        # noise with the content image's shape
generated_image = generated_image.to(device, torch.float)  # .to() is not in-place, so reassign
generated_image.requires_grad = True                        # let gradients flow into the canvas
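
Since the canvas itself is the only thing being optimized (VGG's weights stay frozen), the image tensor is presumably what gets registered with the optimizer. A minimal sketch with the default learning rate from above:

import torch.optim as optim

# optimize the pixels of the canvas directly, not any model parameters
optimizer = optim.Adam([generated_image], lr=0.06)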

Let's start with some experiments... 🔬

Changing Content Layers

Bash command, e.g.:

python3 main.py --reconstruct --style_layers 0 1 2 3 4 --content_layers 1 --optimizer "Adam"

parameters we are using

optimizer: "Adam" 
init_image: "noise"
Content_Layer: 0 | 1 | 2 | 3 | 4
Generated Canvas: [animated canvas per layer]

On an A4000 GPU it took 33s to generate one canvas with the current configuration.

Early layers composed style over the canvas relatively better than higher layers, but the terminal layers lost the semantics of the content. Mid-level layers preserved the content while focusing less on style composition.

Changing Optimizer

python3 main.py --reconstruct --style_layers 0 1 2 3 4 --content_layers 0 --iterations 2000

parameters we are using

optimizer: "LBFGS"
init_image: "noise"
Content_Layer: 0 | 1 | 2 | 3 | 4
Generated Canvas: [animated canvas per layer]

On an A4000 GPU it took 120s to generate one canvas with the current configuration.

Again, early layers composed style over the canvas better than higher layers, but moving towards higher layers the canvas loses the content representation, possibly due to over-composition of style. The last layer has again lost the semantics to quite some extent.

Content

We can initialize the canvas with the content image itself and then update its values so it looks like the content image with the style composed on it. The line below initializes the canvas with the content image.

generated_image = content_image.clone().requires_grad_(True)  # copy the content image; gradients are tracked on the copy

Let's start with some experiments... 🔬

Changing Optimizer

Bash command, e.g.:

python3 main.py --reconstruct --style_layers 0 1 2 3 4 --content_layers 1 --optimizer "Adam" --init_image "content"
Content_Layers: 0 | 1 | 2 | 3 | 4
Adam: [animated canvas per layer]
LBFGS: [animated canvas per layer]
Adam (different content/style pair): [animated canvas per layer]

In the first two rows the only change is the optimizer, and clearly both optimizers produce comparatively similar canvases except at the last layer. Adam needs more iterations than LBFGS to produce a semantically similar canvas, but it is much faster per iteration, since it is a first-order method and does not estimate the curvature of the parameter space like LBFGS does.

So we used Adam once again on a different pair of content and style images (last row) to generate the canvas and found that in all cases the last layer loses some content information and the style over-composes on the canvas. The first two layers give comparatively better results every time.
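
The speed difference comes down to how the two optimizers are driven in PyTorch: Adam takes one plain step per iteration, while LBFGS re-evaluates the loss through a closure, possibly several times per step, to build its curvature estimate. A minimal sketch, where compute_total_loss is a hypothetical helper that assembles the losses from the Introduction:

# Adam: one forward/backward pass per iteration
optimizer = torch.optim.Adam([generated_image], lr=0.06)
optimizer.zero_grad()
loss = compute_total_loss(generated_image)  # hypothetical helper
loss.backward()
optimizer.step()

# LBFGS: step() takes a closure that it may call multiple times
optimizer = torch.optim.LBFGS([generated_image])
def closure():
    optimizer.zero_grad()
    loss = compute_total_loss(generated_image)
    loss.backward()
    return loss
optimizer.step(closure)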

Style

We can initialize the canvas with the style image itself and then update its values so it looks like the content image with the style composed on it. The line below initializes the canvas with the style image.

generated_image = style_image.clone().requires_grad_(True)  # copy the style image; gradients are tracked on the copy

Let's start with some experiments... 🔬

Changing Optimizer

python3 main.py --reconstruct --style_layers 0 1 2 3 4 --content_layers 1 --optimizer "Adam" --init_image "style"
Content_Layers: 0 | 1 | 2 | 3 | 4
Adam: [animated canvas per layer]

Composing the content representation over a style canvas does not seem like a great idea. The last layers over-composed the style with some noise, while content_layer: 2 smoothed out the background and highlighted the content.

Further Studies

From the experiments above we can infer that with content_layer: 4 the canvas has lost semantics to some extent, due either to over-composition of style or to an under-represented content representation. We can verify this in the Visualization section by looking at what each layer contributes to the generated canvas. The same can be said for content_layer: 3, but with relatively less prominence.

With content_layer: 0 we can see that the style is well composed over the canvas while the content representation is also preserved; the same can be said for content_layer: 1, but with less prominence. So for further experiments let's use content_layer: 0 and Adam for fast computation. So far we have only looked at canvases generated from conv layers; let's experiment with relu now.

Content_Layers: 0 | 1 | 2 | 3 | 4
conv: [animated canvas per layer]
relu: [animated canvas per layer]

Looking at all the canvases from conv and relu, we can infer that the two do not produce very different canvases, and it is safe to use either of them for reconstruction.
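
For reference, tapping both the conv and the relu activations out of torchvision's VGG19 feature stack can look roughly like this; which module indices the repository actually maps to layers 0-4 is not shown here, so treat conv_indices as a placeholder:

import torch
import torchvision.models as models

vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.eval()
for m in vgg:
    if isinstance(m, torch.nn.ReLU):
        m.inplace = False  # keep conv outputs intact so both can be stored

def extract(image, conv_indices):
    # collect activations right after the chosen Conv2d modules and after
    # the ReLU modules that immediately follow them in VGG19
    conv_feats, relu_feats = [], []
    x = image
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in conv_indices:
            conv_feats.append(x)
        elif (i - 1) in conv_indices and isinstance(layer, torch.nn.ReLU):
            relu_feats.append(x)
    return conv_feats, relu_feats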

Visualization

Until now we have reconstructed canvases using all the style layers and any one content layer; in this section we visualize the individual and grouped contributions of the style and content layers. There are 3 ways to do so: visualizing only content layer(s), only style layer(s), or both.

The shell command to visualize is

python3 main.py --visualize "content" --content_layers 1 2 --iterations 1500 --fps 30 --sav_freq 5
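
What this presumably amounts to, reusing the loss helpers from the Introduction (a hedged sketch; the feature bookkeeping is illustrative): the canvas is optimized against only the chosen content layers, with no style term at all.

# visualize "content": only the selected content layers drive the update
loss = sum(content_loss(gen_feats[i], target_feats[i]) for i in (1, 2))
loss.backward()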

ContentV

When --visualize "content" is set, we visualize only the content representation of any single layer or a group of layers.

Content_Layers: 0 | 1 | 2 | 3 | 4
Canvas: [animated canvas per layer]

Later layers capture the textures of the content image while not giving much weight to color and low-level feature details. The content_layer: 4 canvas, though, seems to have under-represented the content, possibly because too little gradient signal flows back to the canvas for the update.

Earlier layers captured the shape, and to some extent the texture, really well.

What if we arbitrarily choose some content layers and look at their combined effect on the canvas? Let's check.

python3 main.py --visualize "content" --content_layers 1 3 4 --iterations 700 --fps 2 --sav_freq 5
Content_Layers: 1 3 4 | 0 2 4
Canvas: [animated canvas per group]

StyleV

When --visualize "style" is set, we visualize only the style representation of any single layer or a group of layers.

Style_Layers: 0 | 1 | 2 | 3 | 4
Adam: [animated canvas per layer]

When visualized individually, style layers do not seem to contribute any significant style to the canvas; in fact, moving towards higher layers we see patterns of noise.

What if we arbitrarily choose some style layers and look at their combined effect on the canvas? Let's check.

python3 main.py --visualize "style" --style_layers 1 3 4 --iterations 2000 --fps 25 --sav_freq 8 --optimizer "Adam"
Style_Layers: 0 1 4 | 1 2 3 | 0 1
Adam: [animated canvas per group]
LBFGS: [animated canvas per group]
[canvas output when all the style layers were used]

When we visualize the grouped contributions of layers, we can see some style over the canvas very clearly. LBFGS shows style in every canvas, even where Adam failed to with style_layers: 1 2 3. On looking further into the matter, we found that Adam took at least 4000 iterations to learn the representations and output visually appealing style compared to the others. The reason may be that higher layers focus not on colors but on texture, and Adam finds it harder to extract the color feature information than LBFGS.

Lastly, we can visualize what all the style layers together contribute to the canvas; it looks quite similar to the style image itself.

Both

For fun, we use all the style and content layers to generate the canvas. This configuration worked for the image below, but not for many others.

The original image of the lion was grey.

You can play with the other hyperparameters to generate canvases and enhance your understanding of Neural Style Transfer.
