Written by Caleb Zulawski and Kelvin Lin for ECE411 Computational Graphs for Machine Learning at The Cooper Union. The implementation is based on Image Super-Resolution Using Deep Convolutional Networks.
Images from a variety of ImageNet categories are downloaded. Subimages (33x33) are randomly cropped from each image and used to train the network.
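The random cropping step can be sketched as follows. This is a minimal illustration using NumPy, not the code from this repository; the function name `random_subimages` and its `count` and `rng` parameters are hypothetical, and the image is assumed to be an H x W x C array.

```python
import numpy as np

def random_subimages(image, size=33, count=4, rng=None):
    """Return `count` random size x size crops from an H x W x C array.

    Hypothetical helper for illustration; not part of the repository.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape[:2]
    if height < size or width < size:
        raise ValueError("image is smaller than the requested subimage")
    crops = []
    for _ in range(count):
        # The upper bound of rng.integers is exclusive, so add 1 to allow
        # crops that touch the bottom and right edges.
        top = int(rng.integers(0, height - size + 1))
        left = int(rng.integers(0, width - size + 1))
        crops.append(image[top:top + size, left:left + size])
    return np.stack(crops)
```

Passing a seeded generator (for example `np.random.default_rng(0)`) makes the crops reproducible.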
The network is designed to improve the perceived quality of an image upscaled with bicubic interpolation. The network is made up of 3 convolutional layers (9x9, 64 features; 1x1, 32 features; 5x5, 3 channels). Padding is not used, so the output image is 12 pixels smaller in each dimension.
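The 12-pixel figure follows from the kernel sizes: without padding, a k x k convolution with stride 1 shrinks each dimension by k - 1. A short sketch of that arithmetic (the function name is hypothetical, and stride 1 is assumed):

```python
def valid_conv_output_size(input_size, kernel_sizes=(9, 1, 5)):
    """Size of one dimension after a stack of unpadded, stride-1 convolutions."""
    size = input_size
    for kernel in kernel_sizes:
        # Each unpadded k x k convolution removes k - 1 pixels per dimension.
        size -= kernel - 1
    if size <= 0:
        raise ValueError("input is too small for this stack of kernels")
    return size

# A 33x33 training subimage yields a 21x21 output: 33 - (8 + 0 + 4) = 21.
print(valid_conv_output_size(33))
```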
To train the network, a Gaussian blur is applied to each subimage (3x3 kernel, σ=0.2), which is then downsampled by a factor of 3 and upscaled by a factor of 3 using bicubic interpolation. This interpolated subimage is the input to the network, and the output is compared with the original full-resolution subimage. The results are validated by calculating the peak signal-to-noise ratio (PSNR) gain of the CNN output over the bicubic interpolation. When we stopped our training, the PSNR gain was approximately 1.2 dB.
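The PSNR gain can be sketched as below. This is an illustration rather than the repository's validation code: the function names are hypothetical, pixel values are assumed to lie in 0 to 255, and the comparison is assumed to use the central region of the original and bicubic images (6 pixels trimmed from each side) so that their shapes match the smaller network output.

```python
import numpy as np

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped arrays."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))

def center_crop(image, border=6):
    """Trim `border` pixels from each side (assumed alignment with the output)."""
    if border == 0:
        return image
    return image[border:-border, border:-border]

def psnr_gain(original, bicubic, network_output, border=6):
    """PSNR of the network output minus PSNR of the bicubic interpolation."""
    target = center_crop(original, border)
    return psnr(target, network_output) - psnr(target, center_crop(bicubic, border))
```

A positive gain means the network output is closer to the original than the bicubic interpolation is.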
When generating new images at a higher resolution, the image is scaled by 3 using bicubic interpolation and input to the network.
To download ImageNet, you'll need an account with access to the original ImageNet data, which is freely available for educational and non-commercial use. The account provides an API access key, which is needed to download the data.
Once you have an access key, create a file named imagenet_credentials.sh that contains your credentials in the following format:
username=yourusername
accesskey=youraccesskey
Then you can run ./get_data.sh, which will download and prepare the images from the ImageNet synsets specified in wnids.txt.
Training assumes the data has already been downloaded with the get_data.sh script. First, create a directory for saving the model (saved_model is the default location). To train the model, run ./run.py train. Training continues indefinitely; the program exits only on a keyboard interrupt (Ctrl-C), so let it run until the validation results are satisfactory.
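A train-until-interrupted loop of this kind can be sketched as follows. This is not the code in run.py; `step_fn` and `save_fn` are hypothetical callbacks standing in for one training step and for saving the model, and `max_steps` is added only so the sketch can be bounded.

```python
def train_until_interrupted(step_fn, save_fn, max_steps=None):
    """Call step_fn(step) repeatedly until Ctrl-C, then call save_fn().

    Returns the number of steps that completed.
    """
    steps = 0
    try:
        while max_steps is None or steps < max_steps:
            step_fn(steps)
            steps += 1
    except KeyboardInterrupt:
        # Ctrl-C is the expected way to stop; fall through to saving.
        pass
    finally:
        save_fn()
    return steps
```

Saving in the `finally` clause means the model is written whether the loop ends by interrupt or by reaching `max_steps`.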
Once the model is trained, run ./run.py generate --input your_input_image.jpg. The bicubic interpolation is written to outputs/bicubic.jpg, and the improved interpolation is written to outputs/output.jpg.
Original | Bicubic | Network |
---|---|---|