EL-GY 6123: Introduction to Machine Learning (Graduate) course project

Fast Style Transfer using Keras

Rendering the semantic content of an image in a different style is a difficult image processing task. A task that would take a human days or months to complete can be done in seconds with neural networks. We used the pretrained VGG-16 and VGG-19 networks as loss networks, with both images and videos as inputs, to compare their style transfer quality and performance. We also experimented with adding a texture on top of the style on the content image. The code is commented for easier understanding.


Content Image + Style Image = Stylized Image


Results

Image Style Transfer using VGG-16 and VGG-19

Content and Style images

VGG-16 and VGG-19 output images

The outcome of adding texture and style to the content image using VGG-16

Content Image + Style Image + Texture Image = Texture Style Output Image

Content, Style and Texture Images

Style output without texture, and style-and-texture output image

Video Style Transfer using VGG-19

Content Video

Stylized Video

Implementation

We train on the MS-COCO training set and prepare low-resolution inputs by blurring with a Gaussian kernel of width σ = 1.0 and down-sampling with bicubic interpolation. We resize each of the 80k training images to 256 × 256 and train for 2 epochs over the whole dataset. We use Adam with a learning rate of 1 × 10⁻³. The output images are regularized with total variation regularization, with a strength between 1 × 10⁻⁶ and 1 × 10⁻⁴ chosen via cross-validation per style target. For the VGG-16 loss network, we compute the feature reconstruction loss at layer 'block3_conv3' and the style reconstruction loss at layers 'block1_conv2', 'block2_conv2', 'block3_conv3' and 'block4_conv3'. For VGG-19, we compute the feature reconstruction loss at layer 'block4_conv2' and the style reconstruction loss at layers 'block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1' and 'block5_conv1'. The code is implemented in Keras with TensorFlow as the backend. Training takes roughly 4 hours on a single P40 GPU.
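
The loss computation described above can be sketched in Keras roughly as follows. This is a minimal illustration rather than the repository's exact code: it assumes tf.keras with ImageNet-pretrained VGG-16 weights and inputs in [0, 255] RGB, and the helper names build_loss_network, gram_matrix and style_transfer_losses are hypothetical. The layer names are the ones listed above.

import tensorflow as tf
from tensorflow.keras.applications import vgg16

CONTENT_LAYER = 'block3_conv3'
STYLE_LAYERS = ['block1_conv2', 'block2_conv2', 'block3_conv3', 'block4_conv3']

def build_loss_network():
    # Pretrained VGG-16 with frozen ImageNet weights, used only for feature extraction.
    base = vgg16.VGG16(weights='imagenet', include_top=False)
    base.trainable = False
    outputs = {name: base.get_layer(name).output
               for name in [CONTENT_LAYER] + STYLE_LAYERS}
    return tf.keras.Model(inputs=base.input, outputs=outputs)

def gram_matrix(features):
    # Gram matrix of a (batch, height, width, channels) feature map,
    # normalized by the number of spatial positions and channels.
    shape = tf.cast(tf.shape(features), tf.float32)
    h, w, c = shape[1], shape[2], shape[3]
    flat = tf.reshape(features, (tf.shape(features)[0], -1, tf.shape(features)[-1]))
    return tf.matmul(flat, flat, transpose_a=True) / (h * w * c)

def style_transfer_losses(loss_net, generated, content_img, style_img):
    # Extract features of the generated, content and style images.
    gen = loss_net(vgg16.preprocess_input(generated))
    content = loss_net(vgg16.preprocess_input(content_img))
    style = loss_net(vgg16.preprocess_input(style_img))

    # Feature (content) reconstruction loss at block3_conv3.
    content_loss = tf.reduce_mean(tf.square(gen[CONTENT_LAYER] - content[CONTENT_LAYER]))

    # Style reconstruction loss: squared difference of Gram matrices, summed over layers.
    style_loss = tf.add_n([
        tf.reduce_mean(tf.square(gram_matrix(gen[name]) - gram_matrix(style[name])))
        for name in STYLE_LAYERS])

    # Total variation regularizer on the generated image (strength chosen per style target).
    tv_loss = tf.reduce_mean(tf.image.total_variation(generated))

    return content_loss, style_loss, tv_loss

During training, the generated image comes from the image transformation network, and the three terms are combined with per-style weights to form the objective minimized with Adam.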

Usage

1. Training the VGG-16 and VGG-19 models

python3 train.py --style /style image path/ --output /file name without extension/

Note: The dataset should be kept at images/dataset.
Uncomment line 17 and comment line 20 of train.py to train with VGG-19.
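
For example, assuming a style image stored at images/style/starry_night.jpg (a hypothetical path), a model could be trained with:

python3 train.py --style images/style/starry_night.jpg --output starry_night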

2. Image Style transfer using VGG-16 and VGG-19 models

python3 transform.py --style /style model path/ --input /path to content image/

Note: Uncomment line 22 and comment line 25 of transform.py to use the VGG-19 model for style transfer.
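
For example, assuming the model trained above was saved as starry_night.h5 and the content image is at images/content/bridge.jpg (hypothetical names; the actual model file extension depends on how train.py saves it):

python3 transform.py --style starry_night.h5 --input images/content/bridge.jpg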

3. Texture and style addition to content image

python3 texture_trasform.py --texture /texture model file path/ --style /style model file path/ --input /file path/ --output /output filename without extension/
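
For example, with hypothetical texture and style model files and a content image:

python3 texture_trasform.py --texture brick.h5 --style starry_night.h5 --input images/content/bridge.jpg --output bridge_textured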

4. Video Style Transfer using VGG-16 model

python3 video_transform.py --input /path to content video/ --style /style model path/ --output /filename without extension/
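
For example, with a hypothetical content video and style model:

python3 video_transform.py --input videos/clip.mp4 --style starry_night.h5 --output clip_stylized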

Acknowledgement

We would like to thank Prof. Sundeep Rangan for advising us on the project and for providing Google Cloud Platform GPU credits. We would also like to thank Sam Lee and Logan Engstrom, whose open-source style transfer code on GitHub was used and modified in this project.
