
Train the model with custom dataset #27

Open
Ghaleb-alnakhlani opened this issue May 28, 2022 · 9 comments
Comments

@Ghaleb-alnakhlani

Ghaleb-alnakhlani commented May 28, 2022

Hi,

I really don't know what exactly I need to change in the model in order to run the training.
It would be very helpful if you could tell me how to prepare the dataset for the model.
I wonder what the semantic label should look like.
Let's say we have 3 classes (pedestrian, cow, sheep): what should the target and label folders look like?
This is how I prepared the dataset (example).
Input (note: the mask can be any color other than red, white for example)

[input mask image "0002"]
Target
[target image]
Your help is highly appreciated.

@SushkoVadim
Contributor

Hi,

I think the easiest way to understand the folder structure and the label types is to look at one of the commonly used datasets. Taking Ade20k as an example, I would recommend downloading its contents from this link.
You can see that the images are stored as normal .jpg RGB images.
The label maps are usually stored as grayscale maps of integer values, where the integer at each pixel location corresponds to the class ID.
For such a dataset structure, our dataloader can be found here: https://github.com/boschresearch/OASIS/blob/master/dataloaders/Ade20kDataset.py

The remaining part for you would be to convert your dataset to a similar structure.
As I imagine, this would imply the following:

  • Copy all the files into image/ and label/ folders.
  • Rename all the files so that each image-label pair is named identically.
  • Convert all the label maps to grayscale integer maps, where each value corresponds to a label ID (analogously to Ade20k).
  • Implement a new dataloader for your dataset, taking Ade20kDataset.py as an example. Also take a look at this discussion.

Hope it helps!

@Ghaleb-alnakhlani
Author

Hi,

@SushkoVadim thank you, I have downloaded the Ade20k dataset and looked at its structure; it is simple and straightforward.
However, I have a few questions if you do not mind. My dataset is a little different; here is its current structure:

/ dataset
   /sheep_512
   /sheep_mask
   /cow_512
   /cow_mask
   /pedestrian_512
   /pedestrian_mask

Do I have to create train and val splits for every directory, or can I leave it the way it is now?
Which structure is easier, the above or the following?

/dataset
  /target
    all the images from the 3 classes extracted here
  /label
    all the labels from the 3 classes extracted here

So in this case I will have two folders, similar to Ade20K.
When you mentioned grayscale, is this considered grayscale? (image below)
Also, all my images are transparent and already resized to 512x512.

@Ghaleb-alnakhlani
Author

I am familiar with the Pix2PixHD dataset structure.
I structured the folder with one dataroot containing two folders, train_A and train_B:
!python train.py --label_nc 0 --no_instance --name obj --resize_or_crop none --dataroot /path/to/datasets
In OASIS it is slightly different; --label_nc doesn't function the way it does in Pix2PixHD.

@SushkoVadim
Contributor

Hi,

Both are possible. I think it depends on whether you plan to train one model for the whole dataset with all classes, or a separate model for each of your classes. Since no image contains both the "sheep" and "cow" classes, it makes sense to have separate models for the different classes (but this is just an assumption; it totally depends on your experiment design).
In that case, the structure is either the first one you listed, or the following:

/ dataset_sheep
   /sheep_512
   /sheep_mask
/ dataset_cow
   /cow_512
   /cow_mask

For the masks, I would suggest a simple test:

# --- read label map ---#
import numpy as np
from PIL import Image

label = np.array(Image.open("your_label_map.png"))  # raw integer class IDs
print(np.unique(label))

In your test, you should see the values 0 and 1 (and not, e.g., 255). Then 0 corresponds to the background and 1 to the cow.


Yes, our structure is a bit different from Pix2PixHD's, because their repository is for image-to-image translation, while ours is for semantic image synthesis. The closest repository to ours is SPADE (https://github.com/NVlabs/SPADE).

@Ghaleb-alnakhlani
Author

Ghaleb-alnakhlani commented May 30, 2022

My bad, I forgot to mention that I want to train one model for the whole dataset. What should the structure look like?
Thank you for the tip on testing the mask, that is very helpful.
I am also a little confused: what is the main difference between image-to-image translation and semantic image synthesis? In your opinion, which type does my dataset fall into?

@SushkoVadim
Contributor

In your case, if you want a single model for all classes, you should use the first option:

/dataset
  /target
    all the images from the 3 classes extracted here
  /label
    all the labels from the 3 classes extracted here

Don't forget that in all the masks the background should be class 0, cow class 1, and sheep class 2; this has to be consistent across all the masks.

Image-to-image translation translates images to images, e.g. zebras to horses. Semantic image synthesis is a special class of image-to-image translation that uses masks as input, not images. Your dataset type is suitable for both tasks.

@Ghaleb-alnakhlani
Author

Ghaleb-alnakhlani commented May 30, 2022

Thank you, that was very helpful. I think in my case I can call it semantic image synthesis, since I am using a label mask as input.
Sorry, I did not fully understand the point about the background being class 0, cow 1, and sheep 2 consistently across all masks.
Using the script mentioned above, I can confirm that the mask values are 0 for background and 1 for foreground (no 255).
It would be very helpful if you could elaborate with an example.
Also, what should I choose for --label_nc?

@SnowdenLee
Collaborator

@Ghaleb-alnakhlani As you mentioned, you want to train one model for the whole dataset, so the semantic labels should be consistent across the whole dataset, not just within one image. For example, if you define background-0, cow-1, sheep-2: for a single image containing a cow, the labels are background-0 and cow-1, and for an image containing a sheep, the labels should be background-0 and sheep-2. Labelling only the foreground is not enough; the specific class label is needed here.

@Ghaleb-alnakhlani
Author

Hi @SnowdenLee, thank you. Can you please provide an example of how to prepare my data according to what you mentioned, if you know where this has been implemented before?
Another thing I noticed: is there a way to keep the images unchanged? My real input images are transparent, but the dataloader converts them to RGB, which adds a black background to the transparent areas. What do I need to change to avoid that?
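(For reference, one common way to avoid the black background, assuming Pillow is used for loading, is to composite the RGBA image over a chosen background color before converting to RGB; whether this slots directly into the OASIS dataloader is an assumption:)

```python
# Sketch: flatten an RGBA image over a solid background instead of
# letting .convert("RGB") turn transparent pixels black. The white
# background is an arbitrary choice.
from PIL import Image

def flatten_rgba(img, background=(255, 255, 255)):
    rgba = img.convert("RGBA")
    canvas = Image.new("RGBA", rgba.size, background + (255,))
    return Image.alpha_composite(canvas, rgba).convert("RGB")
```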
