cloud mask issue - C01 vs C02 #407

Open
mcuttler opened this issue May 23, 2023 · 16 comments
Labels
enhancement New feature or request

Comments

@mcuttler

Hey @kvos

We were recently looking at a beach in the Perth metro area and noticed a cloud mask issue similar to the one we were seeing as part of issue #5.

We tried just setting 'cloud_mask_issue': True in the settings, but that didn't resolve our issue. I finally figured out that it appears to be related to which collection ('C01' vs 'C02') the images were downloaded from. When we used 'C02', the cloud_mask_issue switch didn't seem to work, but after switching to 'C01' it worked straight away (see images).

C01
2020-02-03-02-11-43_RGB_L8_C01

C02
2020-02-03-02-11-43_RGB_L8_C02

It looks like there's already some logic within SDS_preprocess.create_cloud_mask to try to make it work for both C01 and C02, but I haven't had a chance to look into why this isn't working for C02. Just wanted to open this up in case others have similar issues. I'll try to dig into this more as I get time.
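
For anyone trying to reproduce this, here is a minimal sketch of the two switches discussed above. It assumes the key names `landsat_collection` (in `inputs`) and `cloud_mask_issue` (in `settings`) as used in the CoastSat example script; the site name and date range are made up, and the polygon and file path are left out.

```python
# Hypothetical site and date range, for illustration only
inputs = {
    'sitename': 'CITY_BEACH_TEST',           # made-up name
    'dates': ['2020-01-01', '2020-03-01'],   # made-up range
    'sat_list': ['L8'],
    'landsat_collection': 'C02',             # switch to 'C01' to compare
}

settings = {
    'inputs': inputs,
    'cloud_mask_issue': True,   # ask CoastSat to drop thin "cloud" features over sand
}
```

Running the same site once with 'C01' and once with 'C02', with everything else unchanged, should be enough to see the difference described here.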

@kvos
Owner

kvos commented May 24, 2023

hey Mike, thanks for the feedback on this one. That's interesting, not sure what is happening there. The cloud masking algorithm (CFMASK) is known to struggle with very bright sand pixels, which is why I implemented the cloud_mask_issue setting (from original feedback that you gave me!), so I am not sure why it doesn't behave the same for C02. Will investigate, let me know if you find something.

@kvos
Owner

kvos commented Jun 7, 2023

hey @mcuttler, any progress on this? Let me know if you'd like me to have a look too. If so, you will need to send me the inputs so I can reproduce the problem, thanks!

@mcuttler
Author

hey @kvos, haven't had a chance to look into this in more detail, besides identifying the sections of code to look into further.
Attached is just the basic 'run' script as a .txt. I've updated the polygon details to be an area in Perth metro (City Beach) that one student is looking into.

The furthest I got was figuring out that swapping 'C02' to 'C01' resolved the cloud mask issue, i.e., the built-in code you have worked. I was playing around with how SDS_preprocess.create_cloud_mask handles 'C02', as this seems to be where the issue might be, and got some initial results by tweaking some of the 'morphology' settings (lines 334-337 in SDS_preprocess), but they seemed to get overwritten in a later stage of the code.

I might have some time this week to keep troubleshooting, let me know if you figure it out!

test_cloud_mask_issue.txt

@DanieTheron
Contributor

DanieTheron commented Sep 7, 2023

Hi @kvos and @mcuttler, I have observed a similar occurrence for the C01 and C02 sets of images, with erroneously masked bands at a few sites I'm studying. The proposed method for the cloud_mask_issue assumes that this masked band is not very wide, so the binary opening with morphology.square() typically removes these bands in the C01 scenario. The reason the C02 band does not get removed as well is the USGS labelling of the pixels (captured in the variable im_QA). See the picture below for reference (arbitrary colours were used for the different values); explanation to follow.

C01vsC02vsErrorvsActual

Firstly, the current algorithm in SDS_preprocess (lines 334-335) creates a "band of True values" for all the cloud_values in im_QA. Here is the line:

# find which pixels have bits corresponding to cloud values
cloud_mask = np.isin(im_QA, cloud_values)

So the algorithm works reasonably well for C01 because the total region of values corresponding to cloud values is not very wide. This is not the case for C02: as seen in the image, the masked region is very wide. The band of cloud_values is much wider for C02 and consequently, due to lines 334-335, this whole band is set to True. So when the opening with morphology.square() is performed, this whole region is retained since it is a massive block of pixels. Similarly, morphology.remove_small_objects() does not set the values to False because these regions are much larger in total area than the current min_size of 100 pixels.

However, when looking at the spread of values (refer to the C02 image), the individual C02 values from cloud_values form distinct narrow bands, similar to C01 earlier. Instead of applying lines 334-335, which create one massive region of True values, I loop over cloud_values and apply the morphology.square() opening and morphology.remove_small_objects() to each individual band of values.
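
A quick way to check this on an image is to compare the width of the combined band with the width of each individual value's band. The snippet below is only a sketch on a synthetic array: the QA codes 101-103 are made up (not real USGS values) and the band geometry is invented to mimic the picture above.

```python
import numpy as np

# Synthetic QA band (all values below are made up, not real USGS codes).
# Rows 10-21 hold three adjacent "cloud" codes, each 4 pixels wide.
im_QA = np.zeros((40, 60), dtype=np.uint16)
im_QA[10:14, :] = 101
im_QA[14:18, :] = 102
im_QA[18:22, :] = 103
cloud_values = [101, 102, 103]

# Width (in rows) of the combined band, measured down each column
combined_width = int(np.isin(im_QA, cloud_values).sum(axis=0).max())

# Width of each individual code's band
per_value_width = {v: int((im_QA == v).sum(axis=0).max()) for v in cloud_values}

print(combined_width)    # 12: wider than a 6-pixel square, so opening keeps it
print(per_value_width)   # {101: 4, 102: 4, 103: 4}: each narrower than the square
```

Counting pixels per column only equals the band width when the band is contiguous, which holds for this synthetic case but may not for a real scene.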

Here is the ORIGINAL CODE:

# find which pixels have bits corresponding to cloud values
cloud_mask = np.isin(im_QA, cloud_values)
# remove cloud pixels that form very thin features. These are beach or swash pixels that are
# erroneously identified as clouds by the CFMASK algorithm applied to the images by the USGS.
if sum(sum(cloud_mask)) > 0 and sum(sum(~cloud_mask)) > 0:
    cloud_mask = morphology.remove_small_objects(cloud_mask, min_size=40, connectivity=1)

    if cloud_mask_issue:
        elem = morphology.square(6) # use a square of width 6 pixels
        cloud_mask = morphology.binary_opening(cloud_mask,elem) # perform image opening
        # remove objects with less than min_size connected pixels
        cloud_mask = morphology.remove_small_objects(cloud_mask, min_size=100, connectivity=1)

Here is my SUGGESTED CODE (traversing over cloud_values):

# find which pixels have bits corresponding to cloud values
cloud_mask = np.isin(im_QA, cloud_values)
# Remove cloud pixels that form very thin features. These are beach or swash pixels that are
# erroneously identified as clouds by the CFMASK algorithm applied to the images by the USGS.
if sum(sum(cloud_mask)) > 0 and sum(sum(~cloud_mask)) > 0:
    cloud_mask = morphology.remove_small_objects(cloud_mask, min_size=40, connectivity=1)

if cloud_mask_issue:
    cloud_mask = np.zeros_like(im_QA, dtype=bool)
    for value in cloud_values:
        cloud_mask_temp = np.isin(im_QA, value)         
        elem = morphology.square(7) # use a square of width 7 pixels
        cloud_mask_temp = morphology.binary_opening(cloud_mask_temp, elem) # perform image opening            
        cloud_mask_temp = morphology.remove_small_objects(cloud_mask_temp, min_size=100, connectivity=1)
        cloud_mask = np.logical_or(cloud_mask, cloud_mask_temp)

So the suggested code creates an array of False values and only adds the clouds (i.e. True values) that were not removed by the morphology.square() opening or morphology.remove_small_objects().
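
To illustrate the difference between the two approaches without needing real imagery, here is a small sketch on synthetic data. Assumptions: the QA codes are made up, `opening_square` is a hypothetical NumPy-only stand-in for the scikit-image opening (squares must fit fully inside the image, so border behaviour differs), and the remove_small_objects step is left out to keep it short.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def opening_square(mask, width):
    """NumPy-only stand-in for binary opening with a width x width square.

    Simplification: squares must fit entirely inside the image.
    """
    # Top-left corners where a full square of True pixels fits
    fits = sliding_window_view(mask, (width, width)).all(axis=(2, 3))
    opened = np.zeros_like(mask, dtype=bool)
    # Repaint every square that fitted; everything else is dropped
    for r, c in zip(*np.nonzero(fits)):
        opened[r:r + width, c:c + width] = True
    return opened


# Synthetic QA band with made-up codes: three adjacent 4-pixel bands
# (mislabelled sand) plus one genuine 15 x 15 cloud.
im_QA = np.zeros((60, 60), dtype=np.uint16)
im_QA[10:14, :] = 101
im_QA[14:18, :] = 102
im_QA[18:22, :] = 103
im_QA[35:50, 20:35] = 103
cloud_values = [101, 102, 103]

# Combined approach: one opening on the union of all cloud codes
combined = opening_square(np.isin(im_QA, cloud_values), 7)

# Per-value approach: open each code separately, then merge
per_value = np.zeros_like(im_QA, dtype=bool)
for value in cloud_values:
    per_value |= opening_square(im_QA == value, 7)

print(combined[10:22, :].any())       # True: the 12-pixel band survives
print(per_value[10:22, :].any())      # False: each 4-pixel band is removed
print(per_value[35:50, 20:35].all())  # True: the real cloud is kept
```

The per-value loop only helps when the individual bands are narrower than the square; a single value that forms a wide band would still survive.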

@kvos
Owner

kvos commented Dec 18, 2023

hi @DanieTheron, apologies for only seeing this now, 3 months later, I completely missed it amongst all the issues. Thanks for the great analysis and for providing a solution. Can you make a pull request with your new cloud masking function? I think it makes a lot of sense to do the morphological operation by value instead of on the combined region.
@mcuttler

@kvos kvos added the enhancement New feature or request label Dec 18, 2023
DanieTheron added a commit to DanieTheron/CoastSat_CloudMaskIssueFix that referenced this issue Feb 26, 2024
Hi,
This commit proposes a solution to the issue kvos#407 "cloud mask issue - C01 vs C02 kvos#407" by traversing over cloud_values. More details provided in kvos#407 .
@kvos
Owner

kvos commented Apr 9, 2024

hi @DanieTheron, has this been implemented yet and merged into the master?

@DanieTheron
Contributor

Hi @kvos, I am not really that informed about how GitHub works regarding merging into the master, but I tried doing something like that using a "commit". How do I "implement and merge it into the master" as per your message?

@kvos
Owner

kvos commented Apr 23, 2024

hi @DanieTheron, you can create a new Pull Request and propose to merge your branch into the master branch. Then I will review the changes and approve the merge. Make sure you pull the latest changes from the master before creating the pull request. Thanks for your contribution to the open-source code!

@DanieTheron
Contributor

Hi @kvos, I have added a Pull Request for your review.

@theocatsu

Hi @DanieTheron, is the code safe to use now? I'm currently having trouble with cloud masking on my study site and this could help, thanks!

@DanieTheron
Contributor

Hi @theocatsu, yes I have implemented this "enhanced" code for many sites (100+) in South Africa, and it works great.

@theocatsu

Hi @DanieTheron, thank you for your answer. I tried it on one of my study sites and it works great on Landsat images! Unfortunately, at another study site, most of the shore is blacked out by cloud-masked pixels even with this correction. It's a site where only S2 images are available, as it's not monitored much, and I would like to train a classifier on it. By any chance, do you know a way to "shut down" the cloud masking so that it doesn't show up in my figure:
Tromelin

Thanks for your insights and your code ;)

And thanks @kvos for your work!

kvos added a commit that referenced this issue Apr 26, 2024
@DanieTheron
Contributor

DanieTheron commented Apr 26, 2024

Hi @theocatsu, a couple of things to note: the cloud mask issue discussed here is for Landsat imagery (Collection 1 and 2), so it is expected that it won't have much of an impact on your case (only S2 images) with the default code. If I understand your message correctly, you want to remove the cloud mask altogether. There are two ways to achieve this without changing the master code too much, both through minor changes in the SDS_preprocess.py script:

  1. Set all the cloud_mask values to False for all images. This is a hard-coded override. Note that you will then have to check manually afterwards that the images you include in your study do not have clouds affecting your results. Depending on which version of CoastSat you are using, try adding the following line of code just before the return cloud_mask line in each of these functions:
def create_s2cloudless_mask(cloud_prob, s2cloudless_prob):
    ...
    ...
    cloud_mask = np.zeros_like(cloud_prob, dtype=bool)  # this line overrides the cloud mask
    return cloud_mask

AND

def create_cloud_mask(im_QA, satname, cloud_mask_issue, collection):
    ...
    ...
    cloud_mask = np.zeros_like(im_QA, dtype=bool)  # this line overrides the cloud mask
    return cloud_mask
  2. Another approach, which does not override the cloud_mask array as in (1) but instead removes potentially erroneous clouds more aggressively, is to increase the default parameters, especially min_size in morphology.remove_small_objects(), and potentially also the width in morphology.square(6).
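
For the second approach, one rough way to pick a new square width is to check which widths still leave something behind after opening. This is only a sketch: `survives_opening` is a hypothetical helper (not part of CoastSat or scikit-image), it ignores image-border effects, and the mask is made up.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def survives_opening(mask, width):
    """True if any width x width block of True pixels exists in mask,
    i.e. if something would be left after opening with that square
    (ignoring image-border effects)."""
    if min(mask.shape) < width:
        return False
    return bool(sliding_window_view(mask, (width, width)).all(axis=(2, 3)).any())


# Made-up mask: a 12-pixel-wide strip of false "cloud" along a shoreline
mask = np.zeros((80, 80), dtype=bool)
mask[30:42, :] = True

for width in (6, 12, 13, 20):
    print(width, survives_opening(mask, width))
# 6 True
# 12 True
# 13 False
# 20 False
```

Keep in mind that a wider square also removes genuine clouds that are narrower than that width, so the images still need a manual check.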

Let me know if you don't get it.

@kvos
Owner

kvos commented Apr 26, 2024

to add to this good suggestion from @DanieTheron, a simpler way to deactivate the s2cloudless mask is by setting the s2cloudless probability to 100 in the settings.
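
a tiny sketch of why a threshold of 100 switches the mask off, assuming the s2cloudless mask is a simple "probability greater than threshold" test on a 0-100 band (the array values are made up):

```python
import numpy as np

# Made-up cloud-probability band, values in percent (0-100)
cloud_prob = np.array([[0, 35, 80],
                       [100, 60, 10]])

# Assumed form of the mask: pixels above the threshold count as cloud
mask_at_40 = cloud_prob > 40
mask_at_100 = cloud_prob > 100

print(mask_at_40.sum())    # 3 pixels flagged
print(mask_at_100.any())   # False: nothing can exceed 100
```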

if the clouds persist, it means that it's not s2cloudless but the standard cloud masking that is the issue. you can set the cloud_mask_issue setting to True and if it's still there, go in the function and do what @DanieTheron suggested, increase the size of the binary element from 6 to 20 and all cloud pixels will be gone.

it's a classic that cloud masks have false positives on white-sand islands, don't worry you'll find a way

@kvos
Owner

kvos commented Apr 26, 2024

> Hi @kvos, I have added a Pull Request for your review.

thanks! I have merged into the master.

@theocatsu

Dear @DanieTheron and @kvos, it was indeed the standard cloud masking that had an issue in my case! After overriding it, the masked pixels vanished. Great way to solve this problem, thanks a lot for your help!
