ibelderbos/gan-for-class-imbalance

User guide

This README explains how to set up and run the implementation code for my research project, 'GANs based Aerial Images Generation for Imbalanced Learning', accepted at the International Conference on Pattern Recognition and Artificial Intelligence 2022 (ICPRAI 2022). The project was carried out in collaboration with Centraal Bureau voor de Statistiek (Statistics Netherlands).

Abstract

In this paper, we examine whether we can use Generative Adversarial Networks as an oversampling technique for a largely imbalanced remote sensing dataset containing solar panels, endeavoring a better generalization ability on another geographical location. To this cause, we first analyze the image data by using several clustering methods on latent feature information extracted by a fine-tuned VGG16 network. These feature-based clusters are substantiated by visualizing their samples in lower dimensional space with t-SNE. After that, we use the cluster assignments as auxiliary input for training the GANs. In our experiments we have used three types of GANs: (1) conditional vanilla GANs, (2) conditional Wasserstein GANs, and (3) conditional Self-Attention GANs. The synthetic data generated by each of these GANs is evaluated by both the Fréchet Inception Distance and a comparison of a VGG11-based classification model with and without adding the generated positive images to the original source set. We show that all models are able to generate realistic outputs as well as improving the target performance. Furthermore, using the clusters as a GAN input showed to give a more diversified feature representation, improving stability of learning and lowering the risk of mode collapse.

Keywords: generative adversarial networks, imbalanced learning, image classification, deep learning, remote sensing

[Figure: research process]

Installation

The required modules can be installed via:

pip install -r requirements.txt

Quick start

Run train.py to train the GAN. Hyperparameters are set through command-line arguments parsed with ArgumentParser; any argument you leave out falls back to its default value. The example below trains a model named my_first_gan with a learning rate of 0.0001.

python train.py --model_name 'my_first_gan' --lr 0.0001
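For readers unfamiliar with ArgumentParser, the following is a minimal sketch of how two such arguments could be declared and parsed. It is not the parser from train.py: only the flag names --model_name and --lr come from the command above, while the function name build_parser and both default values are placeholders chosen for illustration.

```python
import argparse


def build_parser():
    """Build an illustrative parser; the defaults here are assumptions."""
    parser = argparse.ArgumentParser(
        description="Illustrative GAN training arguments (not the real train.py)"
    )
    # Flag names match the command above; default values are placeholders.
    parser.add_argument("--model_name", type=str, default="gan",
                        help="name used to label checkpoints and plots")
    parser.add_argument("--lr", type=float, default=0.0002,
                        help="learning rate for the optimizers")
    return parser


# Passing an explicit list mimics the command line shown above.
args = build_parser().parse_args(["--model_name", "my_first_gan", "--lr", "0.0001"])
print(args.model_name, args.lr)

# With no arguments, every option falls back to its default.
defaults = build_parser().parse_args([])
print(defaults.model_name, defaults.lr)
```

The actual set of arguments and their defaults are defined in train.py, which remains the authoritative source.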

The script automatically creates the directories for the checkpoints, loss plots, and samples of generated images. Before running it, set three paths on lines 79 to 81 of train.py: result_path (the directory where the loss plots and image grids are stored), dataset_path (the directory where the data is stored), and path_to_csv (the path to the CSV file with the cluster labels).
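As a rough sketch of what this setup could look like, the snippet below assigns the three path variables and creates per-model output folders. Only the variable names result_path, dataset_path, and path_to_csv come from this README. The helper prepare_output_dirs, the subfolder names, and all path values are hypothetical, and the layout that train.py actually creates may differ.

```python
import os
import tempfile


def prepare_output_dirs(result_path, model_name):
    """Create per-model output folders (hypothetical layout) if missing."""
    subdirs = {
        name: os.path.join(result_path, model_name, name)
        for name in ("checkpoints", "loss_plots", "samples")
    }
    for path in subdirs.values():
        # exist_ok=True makes repeated runs safe.
        os.makedirs(path, exist_ok=True)
    return subdirs


# Placeholder values: replace with the locations on your own machine.
result_path = os.path.join(tempfile.gettempdir(), "gan_results")
dataset_path = "path/to/images"
path_to_csv = "path/to/cluster_labels.csv"

dirs = prepare_output_dirs(result_path, "my_first_gan")
for name, path in dirs.items():
    print(name, "->", path)
```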

Data

In order to access the data, please send a request to:

Citation

Please use the following BibTeX reference when citing this code:

@inproceedings{belderbos2022gans,
  title={GANs Based Conditional Aerial Images Generation for Imbalanced Learning},
  author={Belderbos, Itzel and de Jong, Tim and Popa, Mirela},
  booktitle={Pattern Recognition and Artificial Intelligence: Third International Conference, ICPRAI 2022, Paris, France, June 1--3, 2022, Proceedings, Part II},
  pages={330--342},
  year={2022}
}

About

Implementation of my research project 'Conditional Generation of Aerial Images for Imbalanced Learning using Generative Adversarial Networks'.
