[best practices] Processing by orbit #81

Open
c00kiemon5ter opened this issue Dec 10, 2020 · 11 comments

Comments

@c00kiemon5ter

c00kiemon5ter commented Dec 10, 2020

Hello, I want to ask a question regarding best practices around processing tiles and specifically focusing on tile-orbits.

We made a small run of MAJA on some L1C products and got great results. Processing a different tile gave less accurate results. Looking into what had happened, we suspect the mix of orbits: the first experiment was on a tile whose products all happened to be on the same orbit, whereas the second tile had mixed orbits.

Sen4CAP, which uses MAJA as the processor to produce L2A products, internally separates tiles per orbit and uses, as the previous product, one from the same orbit.

Given these observations, would it be better to separate processing per tile and orbit, or to let MAJA pick any product as the previous one, regardless of its orbit?

We are running a test case for that second experiment, but this time with separate L1C products per orbit. I will come back with some results and new observations, but your experience and observations would certainly be valuable here.

PS: I am using MAJA v4.2.1 through StartMaja

On v4.2.1, StartMaja simply picks the closest product as the previous one. This seems like a good place to implement a better heuristic that picks the previous product based on some strategy (e.g., among the last 4 products, the one with the least cloud coverage).

https://github.com/CNES/MAJA/blob/4.2.1/StartMaja/Chain/Workplan.py#L259-L266
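
A minimal sketch of the kind of heuristic suggested above, written as standalone Python rather than as a patch to StartMaja. The `Product` class, its fields, and `pick_previous` are hypothetical names introduced for illustration; they are not part of StartMaja's `Workplan.py`. The sketch assumes a cloud percentage is already known for each candidate product.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Product:
    """Hypothetical stand-in for an L2A product; not a StartMaja class."""
    name: str
    acquired: date
    cloud_percent: float


def pick_previous(products: List[Product], current: date,
                  window: int = 4) -> Optional[Product]:
    """Among the `window` most recent products acquired before `current`,
    return the one with the least cloud cover (ties go to the most recent)."""
    earlier = sorted(
        (p for p in products if p.acquired < current),
        key=lambda p: p.acquired,
        reverse=True,
    )[:window]
    if not earlier:
        return None  # nothing to use as a previous product
    # Least cloud first; on equal cloud cover, prefer the latest acquisition.
    return min(earlier, key=lambda p: (p.cloud_percent, -p.acquired.toordinal()))
```

With a window of 1 this reduces to the "closest product" behaviour described above, so the window size is the only knob that changes the strategy.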

@olivierhagolle
Contributor

Hi Ivan,
thanks for sharing this feedback, which makes sense.
I am just wondering whether your assessment relates to the atmospheric correction or to cloud detection. Our own experience is that merging different swaths improves cloud detection in cloudy regions because of the enhanced revisit. It also improves aerosol detection in summer but degrades it in winter, because our directional correction model is quite bad when the sun is low. We are implementing a better directional correction in MAJA 4.3.

Your own validation results will be very welcome in this regard, thanks a lot !

It is quite easy to run start-maja after separating the swaths in different folders, but we should think of implementing what you suggest as an option. Our development pipeline is quite full, so it might take a while.

Best regards,
Olivier

@c00kiemon5ter
Author

c00kiemon5ter commented Dec 10, 2020

Hi Olivier and thanks for the swift reply,

you are right, I should have been more specific. Our remarks refer to cloud detection accuracy.

> merging different swaths improves the cloud detection in cloudy regions because of the enhanced revisit

This is reasonable and we want to take advantage of this. However, this case is about a tile where most of the time the different orbits come one after the other when sorted by date-time (R050 > R093 > R050 > R093 > R050 > R050 > R093 > etc). The important part is that one of the orbits has half the area as NODATA and the other is full.
Given such a case, when MAJA runs in NOMINAL mode and is processing a tile on the orbit that has full image coverage, the previous product it uses will be one with half the image as NODATA. Wouldn't this be a problem?

(I see that MAJA in NOMINAL mode uses a single previous product. Could it use more?)

> It is quite easy to run start-maja after separating the swaths in different folders

We already have an L1C set separated in different ways (year, tile, etc) but now we added the orbit, too. We are now running MAJA for each orbit separately and will come back with results.

PS: The specific tile we are experimenting with is 34SFJ and the two orbits are R050 and R093.
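
For illustration, here is a hedged sketch of how L1C products could be grouped per tile and relative orbit before running each group separately. It relies only on the standard Sentinel-2 L1C product naming (`S2A_MSIL1C_<sensing time>_N<baseline>_R<orbit>_T<tile>_<discriminator>.SAFE`); the function name `group_by_tile_and_orbit` is an assumption made for this example and is not part of MAJA or StartMaja.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Standard Sentinel-2 L1C product name, e.g.
# S2A_MSIL1C_20191016T092031_N0208_R093_T34SFJ_<discriminator>.SAFE
L1C_NAME = re.compile(
    r"^S2[A-Z]_MSIL1C_\d{8}T\d{6}_N\d{4}_"
    r"(?P<orbit>R\d{3})_T(?P<tile>[0-9A-Z]{5})_"
)


def group_by_tile_and_orbit(names: Iterable[str]) -> Dict[Tuple[str, str], List[str]]:
    """Group L1C product names by (tile, relative orbit).

    Names that do not look like L1C products are ignored. Each group is
    sorted by sensing time (the third underscore-separated field).
    """
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for name in names:
        match = L1C_NAME.match(name)
        if match is None:
            continue
        groups[(match.group("tile"), match.group("orbit"))].append(name)
    for members in groups.values():
        members.sort(key=lambda n: n.split("_")[2])
    return dict(groups)
```

Each resulting group, such as `("34SFJ", "R050")`, could then be copied or symlinked into its own L1C folder and processed in a separate run.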

@olivierhagolle
Contributor

To answer your question, MAJA uses only one previous level 2 product, but this product conveys information from a lot of previous images. We use a composite image at 240m resolution, which for each pixel contains the most recent cloud free observation, and it is used as a reference to compare with the new image to classify. This composite is updated at each new image with the new cloud free pixels, and stored in the level 2 (the private part). As a result, if each half of the image is observed on a different swath, it should not be an issue.
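
The composite logic described above can be illustrated with a toy sketch. This is not MAJA code: it uses plain nested lists in place of the 240m raster, and the function name and mask arguments are assumptions made for this example. It only shows why a half-NODATA swath need not be a problem: pixels the new image cannot update keep their older cloud-free value.

```python
from typing import List, Optional

Row = List[Optional[float]]  # None marks "no cloud-free observation yet"


def update_composite(composite: List[Row], image: List[List[float]],
                     cloudy: List[List[bool]],
                     nodata: List[List[bool]]) -> List[Row]:
    """Return a new composite holding, per pixel, the most recent valid,
    cloud-free observation. Pixels that are cloudy or NODATA in the new
    image keep their previous composite value."""
    return [
        [
            old if (is_cloud or is_nodata) else new
            for old, new, is_cloud, is_nodata in zip(c_row, i_row, cl_row, nd_row)
        ]
        for c_row, i_row, cl_row, nd_row in zip(composite, image, cloudy, nodata)
    ]


# One row of four pixels. The first swath covers only the left half.
empty = [[None, None, None, None]]
after_half = update_composite(
    empty,
    image=[[0.1, 0.2, 0.0, 0.0]],
    cloudy=[[False, False, False, False]],
    nodata=[[False, False, True, True]],
)
# The second swath covers everything but has a cloud over the second pixel.
after_full = update_composite(
    after_half,
    image=[[0.3, 0.4, 0.5, 0.6]],
    cloudy=[[False, True, False, False]],
    nodata=[[False, False, False, False]],
)
```

After the two updates every pixel holds a value, and the cloudy pixel in the second image retains the observation from the first swath.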

@ChristinaKarakizi

ChristinaKarakizi commented Dec 13, 2020

Dear Olivier,

Following up on my colleague @c00kiemon5ter's comments, we would like to inform you that our experiments on tile 34SFJ (for year 2019) indicate that using both orbits in the same MAJA run indeed gave better cloud detection results for most scenes.

However, the first remark Ivan made, about much better cloud detection results on another tile (34SEG), is still valid. After further investigation, we think this can be directly attributed to differences in the landscape of the images.

In particular, 34SFJ includes the largest agricultural plain of Greece, which means great heterogeneity in dark/bright values throughout the year and frequent alternation between them within just a few meters. MAJA 4.2.1 gives us a lot of false positives: pixels classified as clouds that are actually clear (cloud-free) observations of crop parcels, even in (almost) cloud-free images.

e.g. True color composite of T34SFJ_20191016 & detected clouds in red (cloud mask >0) from CLM_R1 product of MAJA 4.2.1.
image

It is also interesting that the false-positive issue over agricultural areas is not observed to that extent in some results we produced using MAJA version 3.2.2 (via Sen4CAP). It seems that older versions produce the cloud mask at a coarser resolution. Although this gives a worse, expanded delineation of the cloud shapes, there are also significantly fewer of these false-positive (salt & pepper) pixels detected as clouds.

e.g. Cloud masks from versions 3.2.2 & 4.2.1 on a True color composite of T34SFJ_20190718.
image

We read here (https://labo.obs-mip.fr/multitemp/maja-4-2-is-available-and-open-source/) that version 4.2 makes it possible to run cloud detection at 120m (instead of 240m). Is this the case when using ''start-maja'' with version 4.2.1? We have also observed a significant difference in the size of the CLM products between versions 4.2 and 3.2.

Thank you in advance.

Best Regards,
Christina

@olivierhagolle
Contributor

Thanks Christina.
Very interesting, it is a case we had not observed so far. It might indeed be a consequence of the increase of resolution in the MAJA 4.2 cloud mask, but we will also need to check if all the parameters are OK. Maybe the change of resolution requires a different tuning of parameters.
Could you please provide the command line used, so that we can try to reproduce that case and analyze what happens?
Our analysis, with the help of @jerome-colin, will take some time; don't expect an answer before 2021.
Best regards,
Olivier

@c00kiemon5ter
Author

We run MAJA v4.2.1 (official binary release) on Ubuntu 18.04 (a Vagrant VM), in a dedicated miniconda environment set up with python=3.6.8, gdal=2.3.3 and scipy=1.3.0.

The command we use after having activated the conda environment is the following:

$ cat run.sh
#!/bin/sh

set -e

PATH="/opt/maja/v4.2.1/bin:/opt/miniconda/bin:${PATH}"
. "/opt/miniconda/etc/profile.d/conda.sh"

conda activate maja
startmaja --type_dem eudem -y -f config.ini -t 34SFJ -d 2018-12-01 -e 2020-01-31 
$ cat config.ini
[Maja_Inputs]
exeMaja=/opt/maja/v4.2.1/bin/maja
repWork=/data/work
repGipp=/data/gipp
repMNT=/data/dtm
repL1=/data/L1C
repL2=/data/L2A
#repCAMS=/data/CAMS

[DTM_Creation]
repRAW=/data/dtm/raw
repGSW=/data/dtm/gsw

I will later upload the vagrant recipe and let you know, so that you can replicate the exact environment.


> It might be indeed a consequence of the increase of resolution in MAJA 4.2 cloud mask,

Does this mean that MAJA v4.2 uses 120m resolution to create the cloud masks by default (while previous releases -v3.2.2 in particular- used 240m)?

> but we will also need to check if all the parameters are OK.

If there is something we can help with, let us know and we can have a run with different params/settings.

@jerome-colin
Collaborator

Dear @c00kiemon5ter
I'm not sure it's related to your specific environment. I just launched a run for 34SFJ on our own hpc config. @olivierhagolle and I will investigate and let you know asap.
Thanks for your help, it's really useful for us to get user feedback !

@ChristinaKarakizi

Olivier and Jerome, greetings from Athens,
and wishes for a prosperous 2021!

May I ask if you have any news regarding the overestimation of false positives in the higher-resolution cloud mask of v4.2?

Thank you once again,
Christina

@jerome-colin
Collaborator

jerome-colin commented Feb 1, 2021

Dear Christina,
we ran some tests, in particular on the tiles you mentioned, and found that adjusting the minimum threshold on the temporal variation of reflectance in the blue band could reduce the number of false positives for clouds detected in multitemporal mode at 120m. You can find this parameter in the GIPP directory, in the files named *L2COMM*.EFF.

At line 234:
<Min_Threshold_Var_Blue>0.016</Min_Threshold_Var_Blue>
A value of 0.018 seems more adequate for simulations at 120m in your area of interest, but I would not generalize it. If you want to test it, be sure to modify both the S2A and S2B L2COMM files.

In the meantime, we found that some other parameters in the GIPPs are not well adapted to the 120m resolution (they affect the reflectances rather than the cloud mask). For now, we recommend that you keep processing products at 240m. We will restore 240m as the default resolution until the release of MAJA 4.4, by which time we will have fixed all these issues.

Hope this helps,
With our best wishes for this New Year !
Jerome
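
A hedged sketch of how the threshold change described above could be applied to every `*L2COMM*.EFF` file in one pass, so that the S2A and S2B files stay consistent. The function name is an assumption made for this example, and the edit is a plain byte-level substitution of the element's value rather than a full XML rewrite, so the rest of each file is left untouched. It is safest to run it on a copy of the GIPP directory.

```python
import re
from pathlib import Path
from typing import List

# Matches the element quoted above, whatever value it currently holds.
PATTERN = re.compile(
    rb"(<Min_Threshold_Var_Blue>)[^<]*(</Min_Threshold_Var_Blue>)"
)


def set_min_threshold_var_blue(gipp_dir: str, value: str = "0.018") -> List[str]:
    """Set Min_Threshold_Var_Blue in every *L2COMM*.EFF file under gipp_dir.

    Returns the names of the files that were actually modified.
    """
    replacement = rb"\g<1>" + value.encode("ascii") + rb"\g<2>"
    changed = []
    for path in sorted(Path(gipp_dir).glob("*L2COMM*.EFF")):
        data = path.read_bytes()
        new_data, count = PATTERN.subn(replacement, data)
        if count and new_data != data:
            path.write_bytes(new_data)
            changed.append(path.name)
    return changed
```

Checking that the returned list contains both an S2A and an S2B file is a quick way to confirm that neither satellite was missed.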

@c00kiemon5ter
Author

Hello @jerome-colin

(also, following up from #82 (comment))
from what I see in our installation of MAJA-4.2.1 (manually built), we already have the L2CoarseResolution set to 240.

$ pwd
[redacted]/tools/maja/v4.2.1/install/maja/4.2.1

$ grep -RF "L2CoarseResolution" ./lib/python/StartMaja/userconf
./lib/python/StartMaja/userconf/MAJAUserConfig_SENTINEL2_MUSCATE.xml:	    <L2CoarseResolution>240</L2CoarseResolution>
./lib/python/StartMaja/userconf/MAJAUserConfig_LANDSAT8_MUSCATE.xml:	    <L2CoarseResolution>240</L2CoarseResolution>
./lib/python/StartMaja/userconf/MAJAUserConfig_VENUS_MUSCATE.xml:    <L2CoarseResolution>100</L2CoarseResolution>
./lib/python/StartMaja/userconf/MAJAUserConfig_SENTINEL2_TM.xml:	    <L2CoarseResolution>240</L2CoarseResolution>
./lib/python/StartMaja/userconf/MAJAUserConfig_LANDSAT8.xml:	    <L2CoarseResolution>240</L2CoarseResolution>
./lib/python/StartMaja/userconf/MAJAUserConfig_VENUS.xml:	    <L2CoarseResolution>100</L2CoarseResolution>
./lib/python/StartMaja/userconf/MAJAUserConfig_SENTINEL2.xml:	    <L2CoarseResolution>240</L2CoarseResolution>

$ grep -RF L2CoarseResolution ./etc/conf/
./etc/conf/user/MAJAUserConfig_SENTINEL2_MUSCATE.xml:	    <L2CoarseResolution>240</L2CoarseResolution>
./etc/conf/user/MAJAUserConfig_LANDSAT_MUSCATE.xml:	    <L2CoarseResolution>240</L2CoarseResolution>
./etc/conf/user/MAJAUserConfig_LANDSAT8_MUSCATE.xml:	    <L2CoarseResolution>240</L2CoarseResolution>
./etc/conf/user/MAJAUserConfig_VENUS_MUSCATE.xml:	    <L2CoarseResolution>100</L2CoarseResolution>
./etc/conf/user/MAJAUserConfig_SENTINEL2_TM.xml:	    <L2CoarseResolution>240</L2CoarseResolution>
./etc/conf/user/MAJAUserConfig_LANDSAT8.xml:	    <L2CoarseResolution>240</L2CoarseResolution>
./etc/conf/user/MAJAUserConfig_VENUS.xml:	    <L2CoarseResolution>100</L2CoarseResolution>
./etc/conf/user/MAJAUserConfig_SENTINEL2.xml:	    <L2CoarseResolution>240</L2CoarseResolution>

Only MAJAUserConfig_VENUS.xml and MAJAUserConfig_VENUS_MUSCATE.xml have the L2CoarseResolution set to 100 (not 120).

The same holds for this repository, searching for L2CoarseResolution in XML files:
https://github.com/CNES/MAJA/search?l=XML&q=L2CoarseResolution

So, I think we are already processing at 240m resolution. Are we missing some other configuration option?

@jerome-colin
Collaborator

Hi @c00kiemon5ter,
Actually, the L2CoarseResolution is set in S2product.py, and the userconf L2CoarseResolution has no longer been in use since release 4.2.1. This wasn't clear to me until I tried to jump from 240m to 120m for comparison purposes, and opened the following issue.

You may still guess the resolution of your DTM from the size of some files. As an example, for your 34SFJ tile, the file S2__TEST_AUX_REFDE2_34SFJ_3001_ALC.TIF is around 1.6M at 120m, and 411K at 240m.

That's obviously not satisfactory, and we'll modify the code to let the user set this coarse resolution explicitly (e.g., by adding a --L2CoarseResolution option to startmaja), probably for release 4.4.

Jerome
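
As a rough illustration of the file-size check mentioned above, the sketch below compares a file size against the two reference sizes quoted for tile 34SFJ. Halving the pixel size quadruples the pixel count, which is consistent with the roughly 4x ratio between 1.6M and 411K. The function name and the nearest-on-a-log-scale rule are assumptions made for this example, and the reference sizes may differ for other tiles or compression settings.

```python
import math

# Approximate sizes quoted above for the 34SFJ REFDE2 ALC file, in bytes.
REFERENCE_BYTES = {120: 1.6e6, 240: 411e3}


def guess_coarse_resolution(size_bytes: int) -> int:
    """Return the resolution (in metres) whose reference file size is
    closest to size_bytes on a logarithmic scale."""
    return min(
        REFERENCE_BYTES,
        key=lambda res: abs(math.log(size_bytes / REFERENCE_BYTES[res])),
    )
```

In practice the size would come from something like `os.path.getsize(path)` on the ALC file in the DTM folder.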
