# GAN domain translation for recognition

PyTorch implementation of Dark Side Augmentation.

arXiv | Paper with supplementary | Video (5m)


Codebase for the publication:

Dark Side Augmentation: Generating Diverse Night Examples for Metric Learning [arXiv]. Albert Mohwald, Tomas Jenicek and Ondřej Chum. In International Conference on Computer Vision (ICCV), 2023.

This repository builds on top of image retrieval implemented in mdir and cirtorch and adapts CycleGAN and CUT for image-to-image translation.

(Figure: train and finetune pipeline.)


## Pretrained models

Day-to-night generators:

| Model | Weights |
|---|---|
| CycleGAN (day-to-night) | Download |
| HEDNGAN (day-to-night) | Download |

Image descriptors:

| Model | Avg | Tokyo | ROxf | RPar | Weights | Whitening |
|---|---|---|---|---|---|---|
| GeM VGG16 CycleGAN | 74.0 | 90.2 | 60.7 | 71.0 | Download | Download |
| GeM VGG16 HEDNGAN | 73.5 | 88.8 | 61.1 | 70.7 | Download | Download |
| GeM ResNet-101 CycleGAN | 78.4 | 92.0 | 66.8 | 76.4 | Download | Download |
| GeM ResNet-101 HEDNGAN | 78.4 | 91.7 | 66.6 | 76.8 | Download | Download |

All models are pretrained on Retrieval-SfM 120k.

### Torch Hub

To use any pretrained model, first install PyTorch by following its official installation instructions.

```python
import torch

# Day-to-night generators
cyclegan = torch.hub.load('mohwald/gandtr', 'cyclegan', pretrained=True)
hedngan = torch.hub.load('mohwald/gandtr', 'hedngan', pretrained=True)

# Image descriptors
gem_vgg16_cyclegan = torch.hub.load('mohwald/gandtr', 'gem_vgg16_cyclegan', pretrained=True)
gem_vgg16_hedngan = torch.hub.load('mohwald/gandtr', 'gem_vgg16_hedngan', pretrained=True)
gem_resnet101_cyclegan = torch.hub.load('mohwald/gandtr', 'gem_resnet101_cyclegan', pretrained=True)
gem_resnet101_hedngan = torch.hub.load('mohwald/gandtr', 'gem_resnet101_hedngan', pretrained=True)
```

Models initialized this way are pretrained and loaded on GPU by default. If you do not want to load pretrained weights, pass `pretrained=False`; to load the model on e.g. the CPU, pass `device="cpu"`.
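For example, instantiating the same architecture with these two options:

```python
import torch

# Randomly initialized weights instead of the pretrained ones:
model_scratch = torch.hub.load('mohwald/gandtr', 'gem_vgg16_hedngan', pretrained=False)

# Pretrained weights, but kept on the CPU:
model_cpu = torch.hub.load('mohwald/gandtr', 'gem_vgg16_hedngan', device="cpu")
```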

> [!IMPORTANT]
> The expected input of all descriptor models listed above is a batch of normalized images after a CLAHE transform. The recommended way to obtain the image preprocessing transform (suitable for a dataset loader) is demonstrated in the snippet below:

```python
>>> import torch
>>> model = torch.hub.load('mohwald/gandtr', 'gem_vgg16_hedngan')
>>> model.transform
Compose(
    Pil2Numpy()
    ApplyClahe(clip_limit=1.0, grid_size=8, colorspace=lab)
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], strict_shape=True)
)
```

### Inference

A single global descriptor can be extracted from an image simply with:

```python
import torch
from PIL import Image

model = torch.hub.load('mohwald/gandtr', 'gem_vgg16_hedngan')

# Load the image and apply the model's preprocessing transform.
with open("orloj.jpg", 'rb') as f:
    image = Image.open(f).convert("RGB")
inputs = model.transform(image).unsqueeze(0)  # add the batch dimension

# Extract the global descriptor without tracking gradients.
with torch.no_grad():
    vec = model(inputs)
print(vec)
```

The output is a 512-dimensional L2-normalized whitened vector. For `orloj.jpg`, you should obtain a vector ending very close to:

```
        -6.3813e-03, -2.2138e-04,  2.0179e-03,  1.9477e-02,  6.6316e-03,
         1.0677e-02,  1.0847e-02], device='cuda:0')
```
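Because the descriptors are L2-normalized, the cosine similarity of two images reduces to a dot product of their vectors. A minimal sketch building on the snippet above (`query.jpg` and `match.jpg` are placeholder file names):

```python
import torch
from PIL import Image

model = torch.hub.load('mohwald/gandtr', 'gem_vgg16_hedngan')

def describe(path):
    # Extract one global descriptor for the image at `path`.
    image = Image.open(path).convert("RGB")
    inputs = model.transform(image).unsqueeze(0)
    with torch.no_grad():
        return model(inputs)

# L2-normalized vectors: cosine similarity == dot product, in [-1, 1].
similarity = (describe("query.jpg") * describe("match.jpg")).sum().item()
print(similarity)
```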

## Installation

1. Install ffmpeg and graphviz if you do not have them; ffmpeg is required for OpenCV, and graphviz is used to draw the network architecture.

   ```bash
   sudo apt-get install ffmpeg libsm6 libxext6 graphviz   # Ubuntu and Debian-based systems
   sudo dnf install ffmpeg libSM graphviz                 # RHEL, Fedora, and CentOS-based systems
   ```

2. Clone this repository: `git clone git@github.com:mohwald/gandtr.git && cd gandtr`
3. Install dependencies: `pip install -r requirements.txt`
4. (Optional) Set environment variables:
   - `${CIRTORCH_ROOT}`: where image data and model weights are stored. All necessary data for evaluation and training are downloaded there automatically.
   - `${CUDA_VISIBLE_DEVICES}`: set to a single GPU, e.g. `export CUDA_VISIBLE_DEVICES=0`.
5. Go to `mdir/examples` (a combined example of the optional setup follows).
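Putting the optional steps together, a typical shell setup might look like this (the data root is an illustrative path):

```bash
export CIRTORCH_ROOT=/data/gandtr   # datasets and model weights are downloaded here
export CUDA_VISIBLE_DEVICES=0       # restrict execution to a single GPU
cd mdir/examples
```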

## General scenario format and execution

Inside `mdir/examples`, each experiment is executed by the script `perform_scenario.py`, which runs yaml scenarios with the following structure:

```yaml
TARGET:
  1_step:  # first step parameters dictionary
      ...
  2_step:  # second step parameters dictionary
      ...
  ...
```

Nested dictionary keys can be used in parameters and variables (nested keys are separated by a dot). Bash-style variables are supported within a TARGET, e.g. `${1_step.section.key}`. The special variable `${SCENARIO_NAME}` denotes the name of the executed scenario (the last scenario name, if scenarios are overlaid).
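As an illustration, the following hypothetical scenario (keys invented for this example, not taken from the actual iccv23 scenarios) shows both variable forms:

```yaml
train:
  1_step:
    data:
      dataset: retrieval-sfm-120k
  2_step:
    # Reference a value from the first step via a dotted nested key:
    dataset: ${1_step.data.dataset}
    # Derive a path from the name of the executed scenario file:
    directory: models/${SCENARIO_NAME}
```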

A scenario is executed with `perform_scenario.py` as:

```bash
python3 perform_scenario.py TARGET SCENARIO_NAME_1.yml [SCENARIO_NAME_2.yml]...
```

Scenarios can be overlaid, so that all variables of SCENARIO_NAME_1 are replaced by variables from SCENARIO_NAME_2.
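For instance, a small local scenario can override selected variables of a published one (`my_overrides.yml` is a hypothetical file):

```bash
python3 perform_scenario.py eval iccv23/eval/hedngan.yml my_overrides.yml
```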

## Evaluation

All scenarios for evaluation are located inside iccv23/eval.

To evaluate a model from the ICCV23 paper, e.g. the HED-N-GAN method with the GeM VGG16 backbone, run:

```bash
python3 perform_scenario.py eval iccv23/eval/hedngan.yml
```

> [!WARNING]
> The Oxford and Paris Buildings dataset images are no longer available at their original sources and thus cannot be downloaded automatically. One option is to download the images from Kaggle (requires registration). Images should be placed inside `${CIRTORCH_ROOT}/data/test/{oxford5k, paris6k}/jpg`, without any nested directories.
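One possible way to place the downloaded images, assuming the Kaggle archives were extracted locally (the source directory names are illustrative):

```bash
mkdir -p ${CIRTORCH_ROOT}/data/test/oxford5k/jpg ${CIRTORCH_ROOT}/data/test/paris6k/jpg
# Copy the .jpg files directly into the jpg/ directories, with no nested subdirectories:
cp oxford5k-images/*.jpg ${CIRTORCH_ROOT}/data/test/oxford5k/jpg/
cp paris6k-images/*.jpg ${CIRTORCH_ROOT}/data/test/paris6k/jpg/
```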

To change the GAN generator used in the augmentation, use a different scenario with the corresponding generator name. To change the embedding backbone, replace `eval` with `eval_r101` to evaluate GeM ResNet-101. With these options, you should get the following results:

VGG-16 backbone (`eval`):

| Model | Tokyo | ROxf | RPar |
|---|---|---|---|
| hedngan | 88.8 | 61.1 | 70.7 |
| cyclegan | 90.2 | 60.7 | 71.0 |

ResNet-101 backbone (`eval_r101`):

| Model | Tokyo | ROxf | RPar |
|---|---|---|---|
| hedngan | 91.7 | 66.6 | 76.8 |
| cyclegan | 92.0 | 66.8 | 76.4 |

## Training

All scenarios for training from scratch are located inside iccv23/train.

### GAN generator training

To train a GAN generator from scratch, e.g. HED-N-GAN, run:

```bash
python3 perform_scenario.py train iccv23/train/hedngan.yml
```

To change the GAN model, replace the yaml scenario with the one corresponding to the model name, e.g. `hedngan.yml` with `cyclegan.yml`.

(Optional) After the generator training has finished, the trained generator can translate arbitrary images: provide a list of image paths on standard input and execute the `output` target:

```bash
python3 perform_scenario.py output iccv23/train/hedngan.yml
```
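For example, feeding two image paths on standard input (the file names are placeholders):

```bash
printf '%s\n' day1.jpg day2.jpg | python3 perform_scenario.py output iccv23/train/hedngan.yml
```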

### Metric learning

To finetune an embedding network for image retrieval, using augmentation with the HED-N-GAN generator, run:

```bash
python3 perform_scenario.py finetune iccv23/train/hedngan.yml
```

This command first finetunes the embedding model and then evaluates it.

To change the backbone used for finetuning, replace `finetune` with `finetune_r101` for GeM ResNet-101.
