
Datamodule augmentation defaults #1841

Open

calebrob6 opened this issue Feb 1, 2024 · 8 comments

@calebrob6
Member

Summary

Currently the datamodules divide by 255 by default. This can confuse users who would expect no augmentations by default. We should either make this behavior clearly visible (for example, with a WARNING-level message) or not divide by default.
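
Roughly speaking (an illustrative sketch only, not the exact datamodule code), the default behaves like a Kornia normalization with std=255:

import torch
import kornia.augmentation as K

# Illustrative only: the default augmentation behaves roughly like this,
# subtracting 0 and dividing by 255 on every batch.
default_aug = K.Normalize(mean=torch.tensor([0.0]), std=torch.tensor([255.0]))

batch = torch.full((1, 3, 4, 4), 255.0)  # e.g. uint8 imagery cast to float
print(default_aug(batch).max())  # tensor(1.) -- the data was silently rescaled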

Rationale

No response

Implementation

No response

Alternatives

No response

Additional information

No response

@adamjstewart
Collaborator

For the record, torchvision also divides by 255 by default. But yeah, I'm fine with dividing by 1 by default and allowing data modules to override the std dev.

@adamjstewart adamjstewart added the datamodules PyTorch Lightning datamodules label Feb 1, 2024
@calebrob6
Member Author

They only divide by 255 if the input is uint8, which makes more sense (but it still trips me up sometimes).
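
For example, a quick check against torchvision's documented ToTensor behaviour:

import numpy as np
from torchvision import transforms

to_tensor = transforms.ToTensor()

# uint8 input: torchvision rescales to [0, 1]
img_u8 = np.full((4, 4, 3), 255, dtype=np.uint8)
print(to_tensor(img_u8).max())  # tensor(1.)

# float input: returned as-is, no division by 255
img_f32 = np.full((4, 4, 3), 255.0, dtype=np.float32)
print(to_tensor(img_f32).max())  # tensor(255.)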

@adamjstewart
Collaborator

I'm fine with changing the default to 1, we just need to add a backwards-incompatible tag to warn people and convert all existing data modules to 255 (assuming that's right). Want to make a PR?

Note that some of @lcoandrade's confusion in #1822 was that Raster Vision divides by 255 automatically but he wasn't sure if TorchGeo did too. Maybe he has opinions on this.

@lcoandrade

lcoandrade commented Feb 7, 2024

Hi there!

I think the default behavior should be to normalize to [0, 1] according to the data type used. This would avoid confusion.
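
Something along these lines (a hypothetical helper, just to illustrate the idea):

import numpy as np
import torch

def normalize_by_dtype(image: np.ndarray) -> torch.Tensor:
    """Scale integer imagery to [0, 1] based on its dtype (hypothetical helper)."""
    if np.issubdtype(image.dtype, np.integer):
        max_val = np.iinfo(image.dtype).max
        return torch.from_numpy(image.astype(np.float32) / max_val)
    # floating-point data is assumed to already be in a sensible range
    return torch.from_numpy(image.astype(np.float32))

print(normalize_by_dtype(np.array([[0, 127, 255]], dtype=np.uint8)))
# tensor([[0.0000, 0.4980, 1.0000]])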

I also think this information should be clear in the API. I could only verify the normalization behavior by checking the code on GitHub after @adamjstewart mentioned it to me.

@adamjstewart adamjstewart added the backwards-incompatible Changes that are not backwards compatible label Feb 7, 2024
@adamjstewart adamjstewart added this to the 0.6.0 milestone Feb 7, 2024
@johnnv1

johnnv1 commented Feb 10, 2024

Just my two cents here: "automatically" normalizing data is a really painful experience when working with mostly raw satellite data. A simple example is Sentinel-2, where the L2A product gives us int32 data, but it shouldn't be normalized by dividing by 65536; instead, the data should be divided by 10k and then clipped to [0, 1].

One thing that might help is defaulting to the expectation that image inputs are always in [0, 1], or maybe allowing the user to pass in a callable at initialization to handle the data themselves -- this is something I always need to hack around to be able to use these libraries.
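
For instance, something like this (hypothetical names; the input_transform argument does not exist today):

import torch

def sentinel2_l2a_to_reflectance(image: torch.Tensor) -> torch.Tensor:
    """Hypothetical user-supplied transform: Sentinel-2 L2A digital numbers
    become surface reflectance by dividing by 10 000, then clipping to [0, 1]."""
    return torch.clamp(image.float() / 10_000, 0.0, 1.0)

# Sketch of the suggested API: let the user inject a callable at init time, e.g.
# datamodule = SomeDataModule(..., input_transform=sentinel2_l2a_to_reflectance)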

@DimitrisMantas
Contributor

DimitrisMantas commented Feb 27, 2024

I can do a PR with per-channel min-max scaling to [0,1] as the default augmentation.

I'm actually using this type of scaling for my own work but my implementation requires the user to have the global mins and maxes specified somewhere, as is currently the case with mean and standard deviation.

I think you should be responsible for your own data, and knowing its value range is not that harsh a requirement in my opinion, so let me know if this is OK and I can proceed with it.
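
A rough sketch of what I have in mind (illustrative only, not a final implementation):

import torch

class MinMaxNormalize(torch.nn.Module):
    """Per-channel min-max scaling to [0, 1] using global (dataset-wide)
    minima and maxima supplied by the user, analogous to how mean and
    standard deviation are supplied today. Illustrative sketch only."""

    def __init__(self, mins: torch.Tensor, maxs: torch.Tensor) -> None:
        super().__init__()
        # reshape to (1, C, 1, 1) so the stats broadcast over a (B, C, H, W) batch
        self.register_buffer("mins", mins.view(1, -1, 1, 1))
        self.register_buffer("maxs", maxs.view(1, -1, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return (x - self.mins) / (self.maxs - self.mins)

# e.g. 3 bands with very different global ranges
norm = MinMaxNormalize(torch.tensor([0.0, 0.0, -1.0]), torch.tensor([255.0, 65535.0, 1.0]))
out = norm(torch.zeros(2, 3, 8, 8))  # each channel scaled by its own range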

@adamjstewart
Collaborator

> I can do a PR with per-channel min-max scaling to [0,1] as the default augmentation.

This isn't a good idea. We don't want batch-wise normalization, we want a single normalization parameter for the entire dataset.

Let's default to Normalization(mean=0, std=1) (i.e., subtract zero and divide by 1, i.e., do nothing). Then each data module can change this default. For backwards compatibility, we should probably set std=255 for most datasets, as this seems to be the most common range anyway.
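
Concretely, something like this (an illustrative skeleton, not the actual TorchGeo classes):

import torch
import kornia.augmentation as K

class BaseDataModule:  # illustrative skeleton, not the real TorchGeo base class
    # proposed default: subtract 0 and divide by 1, i.e. a no-op normalization
    mean = torch.tensor([0.0])
    std = torch.tensor([1.0])

    def __init__(self) -> None:
        self.aug = K.Normalize(mean=self.mean, std=self.std)

class SomeUint8DataModule(BaseDataModule):
    # data modules with 8-bit-style imagery override std to keep the old behaviour
    std = torch.tensor([255.0])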

@DimitrisMantas
Contributor

Ah, please excuse the confusion; by "per-channel" I meant that if you have an n-band raster, each of its channels gets normalised according to its own minimum and maximum values.

This approach was required in my case because half my bands are aerial unsigned 8-bit imagery whereas the remaining ones represent rasterised LiDAR attributes (e.g., reflectance, slope, etc.), each with its own unit of measurement and value range.

But it's true that all batches are treated the same way and are scaled according to global parameters, as you mention.

That being said, I think doing nothing may be a more general approach.

Labels
backwards-incompatible Changes that are not backwards compatible datamodules PyTorch Lightning datamodules

5 participants