
Bringing own data #222

Open
manapshymyr-OB opened this issue Aug 8, 2021 · 14 comments

@manapshymyr-OB

manapshymyr-OB commented Aug 8, 2021

Hello! @daniel-j-h Thank you for your project.

I have some questions.
1. Can we bring our own dataset (with labels), for example https://github.com/phelber/EuroSAT?
2. If we can, should we convert the data to PNG, or will robosat work with multi-spectral data?
3. How should we select the zoom level? I intend to use Sentinel-2 data for building detection and am not sure how to determine an appropriate zoom level.
Thanks!

@daniel-j-h
Collaborator

You can bring your own dataset; for multi-spectral data you will need something like #138

Sentinel-2 resolution should be native at z14 if I'm not mistaken, so use that or lower.

Please read the note in the readme, though: this project is no longer maintained, developed, or in any other form active.

https://github.com/mapbox/robosat/blob/cbb1c73328183afd2d6351b7bfa3f430b73103ea/README.md
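
As a rough check on the zoom-level suggestion above, here is a minimal sketch of the standard Web Mercator ground-resolution formula. It assumes 256 px tiles (the tile size used by a given tiling tool may differ) and uses 48.5°N as an approximate latitude for Bavaria; both values are assumptions for illustration, not something stated in this thread.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 equatorial radius used by Web Mercator


def ground_resolution(zoom, lat_deg=0.0, tile_size=256):
    """Approximate metres per pixel of a Web Mercator tile at a given latitude."""
    circumference = 2 * math.pi * EARTH_RADIUS_M
    return circumference * math.cos(math.radians(lat_deg)) / (tile_size * 2 ** zoom)


# Compare a few zoom levels at the equator and at an assumed latitude of 48.5 degrees.
for z in (12, 13, 14, 15):
    print(z, round(ground_resolution(z), 2), round(ground_resolution(z, 48.5), 2))
```

At the equator z14 works out to roughly 9.55 m per pixel, which is close to the 10 m resolution of the Sentinel-2 visible and near-infrared bands. At higher latitudes the same zoom level covers less ground per pixel, so z14 tiles there would be upsampled and a lower zoom may be the better match.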

@manapshymyr-OB
Author

@daniel-j-h Thanks a lot for your response. One more question: I want to run the training only on the Bayern region, and for this I would like to use multiple Sentinel-2 scenes (with different acquisition dates). However, as far as I understand, tiling a scene produces images named by x and y: a folder named after the x coordinate, containing images named after the y coordinate. How can I add multiple images of the same area (taken at different times) to the training dataset? I think I cannot change the tile names, am I right?

@daniel-j-h
Collaborator

daniel-j-h commented Aug 10, 2021 via email

@manapshymyr-OB
Author

@daniel-j-h if I use multiple images with different dates of the same scene (for example Bayern region with 2019 and 2020 data), will this change anything?

@daniel-j-h
Collaborator

Should work, too! Ideally you should strive for a balanced dataset and also make sure to first shuffle your dataset, and then split into train / validate / test.

@manapshymyr-OB
Author

@daniel-j-h Last question for now: can my data be multi-channel? Currently I am converting TIFF images to PNG, and I am not sure whether I really need this step.

@manapshymyr-OB
Author

@daniel-j-h You wrote "Ideally you should strive for a balanced dataset and also make sure to first shuffle your dataset, and then split into train / validate / test." What do you mean by balanced and shuffle?

@daniel-j-h
Collaborator

For multi-spectral data you will need something like #138

By balanced & shuffle I mean

  • don't have one month from 2019 and one year from 2020
  • don't train on 2019 and validate on 2020

Try to have e.g. 50% from 2019 and 50% from 2020, then shuffle those, and train / validate / test on subsets of those.

Good luck!
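
The balance-then-shuffle-then-split advice above could be sketched roughly as follows. This is a minimal illustration using only the standard library; the function name `balanced_split`, the 80/10/10 fractions, and the fixed seed are hypothetical choices, not part of robosat.

```python
import random


def balanced_split(tiles_by_year, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle a year-balanced pool of tiles, then split it into train/validate/test."""
    rng = random.Random(seed)
    # Balance: draw the same number of tiles from every year.
    n = min(len(tiles) for tiles in tiles_by_year.values())
    pool = []
    for year, tiles in sorted(tiles_by_year.items()):
        pool.extend((year, tile) for tile in rng.sample(tiles, n))
    # Shuffle first, then split, so every subset mixes all years.
    rng.shuffle(pool)
    n_train = int(fractions[0] * len(pool))
    n_val = int(fractions[1] * len(pool))
    return pool[:n_train], pool[n_train:n_train + n_val], pool[n_train + n_val:]


# Hypothetical tile names: 500 tiles from 2019 and 300 from 2020.
tiles = {
    2019: [f"2019/{i}.png" for i in range(500)],
    2020: [f"2020/{i}.png" for i in range(300)],
}
train, validate, test = balanced_split(tiles)
print(len(train), len(validate), len(test))
```

Downsampling the larger year to the size of the smaller one is only one way to balance; it discards data, which may or may not be acceptable for a given dataset.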

@manapshymyr-OB
Author

Thanks for your quick responses.
I am a bit confused about the multi-channel processing.
Currently I am doing the following steps:
I have Sentinel data and create a TIFF image with bands B4, B3, B2, and B8.
Then I tile it at zoom level 14, but as a result I get .png tiles. Are these steps correct?

Could you please outline the first steps for multi-channel data? (I need to create a dataset for robosat.)

@daniel-j-h
Collaborator

daniel-j-h commented Aug 15, 2021 via email

@manapshymyr-OB
Author

@daniel-j-h Hello again. I now have 100 Sentinel scenes (50 for 2019 & 50 for 2020) as GeoTIFFs. The next step is to tile them using gdal2tiles or rio-tiler, right?

@daniel-j-h
Collaborator

daniel-j-h commented Aug 19, 2021 via email

@manapshymyr-OB
Author

@daniel-j-h Thanks for your reply. I know that this project is not supported anymore; anyway, I want to try it and see the results. When tiling GeoTIFF images, the result is PNG images in a z/x/y structure, where z is the zoom level and x, y are the tile numbers. rs subset therefore uses this directory structure to filter. As far as I understand, rs subset filters out tiles that are not listed in building.tiles, and this is based on z/x/y.*, right? How should I proceed if I have 4 repeating scenes (so the images will have repeating z/x/y)? How should I go through the dataset creation steps? Could you please give me some guidance?

As you mentioned previously, I could offset x and y, but then the tiles will no longer match those in building.tiles...

Reply from daniel-j-h (Aug 10, 2021, via email): For training I don't think we actually use the z/x/y tile coordinates; I think they are only used for prediction and merging. If that's true, you could work around this by changing your tile z/x/y during training, e.g. add an offset or random ints to z/x/y, should work ™️
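
The offset workaround described above could look something like the sketch below. It assumes each scene has been tiled into its own z/x/y.* directory and, as the reply suggests but does not confirm, that training does not depend on real tile coordinates. The helper names and the choice of shifting x by whole multiples of 2**z are hypothetical; that particular shift is used here because valid x values at zoom z are below 2**z, so shifted columns cannot collide and the original column can be recovered.

```python
import shutil
from pathlib import Path


def offset_x(x, z, scene_index):
    """Shift a tile column by scene_index times the 2**z columns that exist at zoom z."""
    return x + scene_index * 2 ** z


def original_x(shifted_x, z):
    """Recover the real tile column from a shifted one."""
    return shifted_x % 2 ** z


def merge_scenes(scene_dirs, out_dir):
    """Copy z/x/y.* tiles from several scene directories into one tree without name clashes."""
    out_dir = Path(out_dir)
    for scene_index, scene_dir in enumerate(scene_dirs):
        # Expects exactly the layout <scene_dir>/<z>/<x>/<y>.<ext> with integer z and x.
        for tile in Path(scene_dir).glob("*/*/*.*"):
            z, x = int(tile.parent.parent.name), int(tile.parent.name)
            target = out_dir / str(z) / str(offset_x(x, z, scene_index)) / tile.name
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(tile, target)
```

One possible way around the mismatch with building.tiles is to filter each scene with its real coordinates first and only then merge, calling `merge_scenes` once for the image directories and once for the mask directories with the scenes in the same order, so that an image and its mask presumably end up under the same shifted name.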

@daniel-j-h
Collaborator

@daniel-j-h Thanks for your reply. I know that this project is not supported anymore. Anyway, [.. wall of text here]

no-support
