This project implements a Convolutional Autoencoder for image colorization. The model is designed to take grayscale images as input and generate corresponding colorized versions. It utilizes deep learning techniques, specifically convolutional neural networks, to learn the mapping between grayscale and color images.
- Convolutional Autoencoder: The core of the project is a convolutional autoencoder architecture, which learns to encode and decode image features to perform effective colorization.
- Grayscale to Color: The model is trained to transform grayscale images into their corresponding colorized versions, adding vibrancy and detail to the input images.
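
To make the encode/decode idea concrete, here is a minimal PyTorch sketch of a convolutional autoencoder that maps a 1-channel (grayscale) image to a 3-channel (color) image. It is not the architecture shipped in this repository: the class name `TinyColorizer`, the layer sizes, and the sigmoid output are all assumptions chosen for illustration.

```python
import torch
from torch import nn


class TinyColorizer(nn.Module):
    """Hypothetical minimal convolutional autoencoder (not this project's model)."""

    def __init__(self) -> None:
        super().__init__()
        # Encoder: two stride-2 convolutions, each halving height and width.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Decoder: two stride-2 transposed convolutions, each doubling height and width.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 3, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),  # assumption: color values scaled to [0, 1]
        )

    def forward(self, gray: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(gray))


model = TinyColorizer()
gray_batch = torch.rand(2, 1, 64, 64)  # random stand-in for two grayscale images
color_batch = model(gray_batch)        # shape: (2, 3, 64, 64)
```

Because each encoder layer halves the spatial size and each decoder layer doubles it, the output matches the input size whenever height and width are divisible by 4, which holds for 600 by 800 images.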
The first architecture was used to colorize images.
A visualization of the architecture (created with www.draw.io) is given below:
Many parameter settings were tried; the best outputs came from the following configuration:
- Training set size: 28411 images
- Number of model parameters: 889082
- Number of epochs: 30
- Iterations per epoch: 1421
- Learning rate: 0.001
- The model file has been included at `extra/cnn_final.pt`.
- A small presentation (in pptx and pdf format) about the project has been included in the `extra` subfolder.
- Example on the test set:
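
As a quick sanity check on the training figures listed above: 1421 iterations over 28411 images is consistent with a batch size of 20 when the final partial batch is kept. The batch size is not stated in this README, so `ASSUMED_BATCH_SIZE` below is an inference, not a documented setting.

```python
import math

NUM_TRAIN_IMAGES = 28411     # from the list above
ITERATIONS_PER_EPOCH = 1421  # from the list above
EPOCHS = 30                  # from the list above
ASSUMED_BATCH_SIZE = 20      # assumption: inferred, not stated in this README


def iterations_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of batches per epoch when the last partial batch is kept."""
    return math.ceil(num_images / batch_size)


# The assumed batch size reproduces the reported iteration count.
matches = iterations_per_epoch(NUM_TRAIN_IMAGES, ASSUMED_BATCH_SIZE) == ITERATIONS_PER_EPOCH

# Total number of optimizer steps over the whole run.
total_iterations = EPOCHS * ITERATIONS_PER_EPOCH
print(matches, total_iterations)
```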
- Python 3.x
- Dependencies: PyTorch, Matplotlib, Torchvision, Scikit-Image
git clone https://github.com/g-nitin/convolutional-autoencoder.git
cd convolutional-autoencoder
pip install -r requirements.txt
This autoencoder was specifically designed to train on 600 by 800 pixel images.
The `data` folder contains the `data_builder.py` file, which uses the google-landmark dataset and fetches only images of size 600 by 800.
To get the data, just run the `data_build.py` file, which will create a sub-folder (`google-landmark`) in `data`.
That sub-folder contains `train` and `test` sub-folders holding the respective images.
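
Before training, it can help to confirm that this layout exists. The helper below is a small sketch and not part of the repository: the function names `expected_dataset_dirs` and `missing_dataset_dirs` are hypothetical, and it assumes the folder names described above, relative to the repository root.

```python
from pathlib import Path


def expected_dataset_dirs(data_root: str = "data") -> dict:
    """Return the train/test directories described above (assumed layout)."""
    base = Path(data_root) / "google-landmark"
    return {"train": base / "train", "test": base / "test"}


def missing_dataset_dirs(data_root: str = "data") -> list:
    """List the split names whose directory does not exist yet."""
    return [
        name
        for name, path in expected_dataset_dirs(data_root).items()
        if not path.is_dir()
    ]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```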
Running `main.py` generates a `results` folder with one sub-folder for each new model test.
To avoid pushing large amounts of data, the `results` folder is ignored by Git.
- If you are implementing another architecture, it must be defined in `network.py`, and the corresponding changes must be made in `utilities.py`.
This project is licensed under the MIT License - see the LICENSE file for details.
The repository was inspired by George Kamtziridis' articles and the respective code.
Feel free to contribute by opening issues or pull requests. Any feedback or improvements are welcome!
This project was inspired by the materials learned in my MATH529 (Introduction to Deep Learning) class at USC during the Fall 2023 semester.
- Email: niting1209@gmail.com