
Training the model on my own custom dataset #13

Open
IzzeddinTeeti opened this issue Jul 31, 2021 · 7 comments

Comments

@IzzeddinTeeti

Thanks for uploading such an interesting code. I am wondering if it is possible to train the model on my own custom dataset? If yes, what is the procedure?

@SushkoVadim
Contributor

Hi,

Yes, this is possible, and it should take relatively little effort to implement a custom dataloader.

Step 1:
You should create a file CustomDataset.py in the dataloaders folder. Copy-paste all the contents from the Ade20k dataloader dataloaders/Ade20kDataset.py into this file.

Step 2:
Based on the properties of your dataset, you should adjust some parameters in the __init__() function. They should be:

opt.label_nc - insert here the number of classes in your dataset (excluding don't care label) (instead of 150).
opt.contain_dontcare_label - this should be True if your dataset has a don't care label, and False otherwise.
opt.semantic_nc - opt.label_nc + 1 if opt.contain_dontcare_label is True, otherwise simply same value as opt.label_nc
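For illustration, with a hypothetical dataset of 20 classes plus a don't care label (the numbers here are made-up examples, not values from the repo), the three options would be set as:

```python
from types import SimpleNamespace

# Hypothetical example: 20 semantic classes plus a don't care label.
# (SimpleNamespace stands in for the project's opt object.)
opt = SimpleNamespace()
opt.label_nc = 20                    # classes, excluding don't care
opt.contain_dontcare_label = True    # dataset has a don't care label
opt.semantic_nc = opt.label_nc + (1 if opt.contain_dontcare_label else 0)  # 21
```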

Step 3:
The function list_images() should be adjusted to match the structure of your folders. It should return the list of names of images and labels, and a tuple with image and label root folders.

Step 4:
The newly created file should be referenced in /dataloaders/dataloaders.py.
For this, add there the following two lines:

if mode == "custom":
    return "CustomDataset"

to the get_dataset_name() function.

Things to keep in mind:

  • If your dataset has a don't care label, then for correct computation of losses this class should go in front of all other classes, so it should have id=0.
  • After implementing it, the program should be called with flag --dataset_mode custom
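The first point above can be sketched as a small remapping step, assuming (hypothetically) that the raw label maps use 255 for don't care and 0..N-1 for the real classes:

```python
import numpy as np

# Hypothetical remap: move the don't care class to id 0 and shift the real
# classes up by one, assuming raw maps use 255 for don't care and 0..N-1
# for the real classes.
def remap_dontcare(label_map, dontcare_id=255):
    return np.where(label_map == dontcare_id, 0, label_map + 1)
```

With this convention, opt.contain_dontcare_label would be True and the don't care class ends up at id=0, as the losses expect.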

Let me know whether this works for you.
Regards,
Vadim

@IzzeddinTeeti
Author

IzzeddinTeeti commented Jul 31, 2021

@SushkoVadim Thank you so much. I did what you suggested and got the following error

--- Now computing Inception activations for real set ---
--- Finished FID stats for real set ---
Created OASIS_Generator with 74314691 parameters
Created OASIS_Discriminator with 22258904 parameters
Traceback (most recent call last):
File "train.py", line 44, in <module>
loss_G, losses_G_list = model(image, label, "losses_G", losses_computer)
File "/home/izzeddin/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/izzeddin/anaconda3/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/izzeddin/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/mars-beta/izzeddin/OASIS/models/models.py", line 36, in forward
fake = self.netG(label)
File "/home/izzeddin/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/mars-beta/izzeddin/OASIS/models/generator.py", line 43, in forward
x = self.body[i](x, seg)
File "/home/izzeddin/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/mars-beta/izzeddin/OASIS/models/generator.py", line 78, in forward
dx = self.conv_0(self.activ(self.norm_0(x, seg)))
File "/home/izzeddin/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/mars-beta/izzeddin/OASIS/models/norms.py", line 23, in forward
segmap = F.interpolate(segmap, size=x.size()[2:], mode='nearest')
File "/home/izzeddin/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 3132, in interpolate
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
RuntimeError: CUDA error: device-side assert triggered

Btw, I am trying to do style transfer from overcast images to rainy images, so I do not have label images or label classes. What do you suggest I set label_nc to?

@IzzeddinTeeti
Author

@SushkoVadim Update, I changed L156 in models.py to target_map[target_map == c] = torch.randint(0,2,(1,)).cuda(), and got the following error:

Created OASIS_Generator with 78461891 parameters
Created OASIS_Discriminator with 22268654 parameters
/home/izzeddin/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py:1628: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
/home/izzeddin/anaconda3/lib/python3.8/site-packages/torch/nn/parallel/_functions.py:64: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
[epoch 0/200 - iter 0], time:0.000

I think it is stuck at iteration 0!

@SushkoVadim
Contributor

Hi,

Our project is mainly designed for semantic image synthesis, and we haven't tested it for general image-to-image translation.
For example, the generator always expects to receive a semantic label map to feed to SPADE layers, while the loss function expects label maps for loss computation.

You can in principle set label_nc to zero and adapt the code to work without label maps (no SPADE layers, only binary cross-entropy loss), but please note that this would probably require some re-implementation of our functions.

For your last message:
[epoch 0/200 - iter 0], time:0.000 actually means that the network has successfully passed the first training iteration. Did you wait longer until the next message appeared?
By default, the program prints such a message every 1000 iterations; you can set this interval manually via --freq_print.

@lennart-maack


@SushkoVadim Thanks a lot for your comprehensive answer.

One minor addition to your Step 2:

When copying the content from Ade20kDataset, do not forget to rename the class from Ade20kDataset to CustomDataset.

Best regards,
Lennart

@leapxcheng

Excuse me, I want to ask some questions concerning the use of custom datasets. How should opt.label_nc be defined? I found that label_nc in your code differs from the description of the datasets on Papers with Code (https://paperswithcode.com/dataset/cityscapes). Also, my custom dataset has fewer labels than Cityscapes (just 29 in total, including the class 'others', so should I set opt.label_nc = 28 and semantic_nc = 29?). Should I define a specific labelcolormap function like the one for Cityscapes?

Best wishes

@SushkoVadim
Contributor

Hi, yes, opt.label_nc should be set to the number of semantic classes, and semantic_nc is set to opt.label_nc + 1 in case you have a "don't care", "unlabelled", or "other" label. So in your example, the numbers 28 and 29 are correct.

Probably, there exist different versions of Cityscapes, for the one we used we observed 35 classes (link).

The colormap is not needed. The dataloader uses a simple representation of label maps with an integer assigned to each pixel.
Cityscapes also has label maps without any colormap, which are used by our dataloader.
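This integer representation can be sketched with a toy example (not code from the repo): each pixel stores a class id directly, and the model later expands it into semantic_nc one-hot channels.

```python
import numpy as np

# Toy example (not repo code): a label map stores one class id per pixel,
# with no colormap; one-hot encoding expands it to semantic_nc channels.
label = np.array([[0, 3], [28, 28]])     # 2x2 map with class ids in 0..28
semantic_nc = 29
one_hot = np.eye(semantic_nc)[label]     # shape (2, 2, 29)
```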
