added a simple MLP neural network for wet-dry classification #146

eoydvin · 2023-12-05T10:08:39Z

This uploads the MLP method in issue #145

The network was trained on a few CMLs in Norway, with rainfall recorded by nearby disdrometers as the reference. It can be run similarly to the existing CNN (Polz et al. 2020), for instance using:

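import numpy as np
import xarray as xr
from tqdm import tqdm

# Assumed import path, based on the file added in this PR (pycomlink/processing/wet_dry/mlp.py)
from pycomlink.processing.wet_dry.mlp import mlp_wet_dry

# cml is assumed to be an existing xarray.Dataset with a variable tl
# and the dimensions cml_id, sublink_id and time
cml["wet_oydvin"] = xr.full_like(cml.tl, np.nan)
cml["wet_p_oydvin"] = xr.full_like(cml.tl, np.nan)

for cmlid in tqdm(cml.cml_id):
    cml_tmp = cml.sel(cml_id=cmlid)
    # cml_tmp = cml_tmp.resample(time="1min").first().to_dataset()

    # One prediction per time step, based on the TL of both sublinks
    mlp_out = mlp_wet_dry(
        cml_tmp.isel(sublink_id=0).tl.values,
        cml_tmp.isel(sublink_id=1).tl.values,
    )

    cml_tmp["wet"] = xr.full_like(cml_tmp.tl, np.nan)
    cml_tmp["wet_p"] = xr.full_like(cml_tmp.tl, np.nan)

    # Both sublinks get the same result; column 1 of mlp_out is the probability of wet
    cml_tmp["wet_p"].loc[{"sublink_id": "sublink_1"}] = mlp_out[:, 1]
    cml_tmp["wet_p"].loc[{"sublink_id": "sublink_2"}] = mlp_out[:, 1]
    cml_tmp["wet"].loc[{"sublink_id": "sublink_1"}] = np.argmax(mlp_out, axis=1)
    cml_tmp["wet"].loc[{"sublink_id": "sublink_2"}] = np.argmax(mlp_out, axis=1)

    # Write the per-CML result back into the full dataset
    cml["wet_p_oydvin"].loc[dict(cml_id=cmlid)] = cml_tmp.wet_p
    cml["wet_oydvin"].loc[dict(cml_id=cmlid)] = cml_tmp.wet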
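
A note on the np.argmax step in the snippet above, as a minimal sketch rather than part of the PR: np.argmax treats NaN as the maximum, so a row that is all NaN yields class 0 (dry) instead of NaN. Assuming mlp_out has shape (n_samples, 2) and contains all-NaN rows where no prediction was made, the hypothetical helper classes_from_probabilities below keeps those time steps as NaN. The example array is made up.

```python
import numpy as np


def classes_from_probabilities(mlp_out):
    """Hypothetical helper: argmax over class columns, keeping NaN rows as NaN."""
    mlp_out = np.asarray(mlp_out, dtype=float)
    # Rows where any class probability is missing carry no prediction
    missing = np.isnan(mlp_out).any(axis=1)
    classes = np.full(mlp_out.shape[0], np.nan)
    if (~missing).any():
        classes[~missing] = np.argmax(mlp_out[~missing], axis=1)
    return classes


# Made-up example output: [p_dry, p_wet] per time step
example = np.array([[0.9, 0.1], [np.nan, np.nan], [0.2, 0.8]])
classes = classes_from_probabilities(example)
```

With plain np.argmax(example, axis=1) the middle time step would come out as 0, which is indistinguishable from a dry classification.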


cchwala · 2023-12-05T10:22:02Z

Thanks @eoydvin 👍

I will add some (probably minor) comments.

One question: Do you have a notebook where you show the application? We do not yet have a notebook that compares the different wet-dry classification methods. But maybe you could start by adding a very simple and minimal notebook for your method. Then we can later add the other methods.

cchwala

Looks good. I only have some minor comments.

cchwala · 2023-12-05T10:27:26Z

pycomlink/processing/wet_dry/mlp.py

+    threshold=None, # 0.5 is often good, or argmax
+):
+    """
+    Wet dry classification using a simple neural network based on channel 1 and channel 2 of a CML


Can you state here some more details or is there a document that you can reference?

E.g. what are the details of the network (MLP, but how many neurons, layers)? What is the sample length, i.e. what is the minimum length of the time series that has to be supplied? Explain if and how the model is applied in a sliding window. How are NaNs handled?

I know that the CNN wet-dry also has very little info in the doc string, but it has the paper with many details. (not saying that we need a paper or something similar here...)

I will provide this somehow. Max gave me the idea of publishing it as a technical note somewhere.

It can just be 3-4 lines of text in the doc string. That will be sufficient. But right now the user has absolutely no idea what the function uses. Of course, feel free to write a "technical note" paper any time ;-)

cchwala · 2023-12-05T10:30:43Z

pycomlink/processing/wet_dry/mlp.py

+    trsl_channel_2 : iterable of float
+         Time series of received signal level of channel 2
+    threshold : float 
+        Threshold (0 - 1) for setting event as wet or dry. 


It would be good to have the option of setting this to None and returning the continuous output instead of the binary one derived with the threshold. The threshold can easily be applied later, and this might make it easier to create a ROC curve where you want to sweep over the thresholds.

update: just saw that None is default, but your docstring is not correctly telling us that ;-)
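
As a rough sketch of the use case described here, assuming the function returns a continuous wet probability per time step when threshold=None: the threshold can be applied afterwards, and sweeping it yields the points of a ROC curve. The arrays wet_prob and reference_wet are made-up illustration data, and roc_points is a hypothetical helper, not part of pycomlink.

```python
import numpy as np


def roc_points(wet_prob, reference_wet, thresholds):
    """Hypothetical helper: (false positive rate, true positive rate) per threshold."""
    wet_prob = np.asarray(wet_prob, dtype=float)
    reference_wet = np.asarray(reference_wet, dtype=bool)
    # Ignore time steps without a prediction
    valid = ~np.isnan(wet_prob)
    p, ref = wet_prob[valid], reference_wet[valid]
    points = []
    for thr in thresholds:
        pred_wet = p >= thr  # the threshold is applied after the fact
        tp = np.sum(pred_wet & ref)
        fp = np.sum(pred_wet & ~ref)
        # max(..., 1) avoids division by zero if one class is absent
        tpr = float(tp / max(ref.sum(), 1))
        fpr = float(fp / max((~ref).sum(), 1))
        points.append((fpr, tpr))
    return points


# Made-up continuous output and reference (1 = wet, 0 = dry)
wet_prob = np.array([0.05, 0.20, 0.70, np.nan, 0.90, 0.40])
reference_wet = np.array([0, 0, 1, 1, 1, 0])
points = roc_points(wet_prob, reference_wet, thresholds=[0.0, 0.5, 1.01])
```

Keeping the continuous output means the model only has to be run once, however many thresholds are evaluated.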

cchwala · 2023-12-05T10:34:13Z

pycomlink/tests/test_wet_dry_mlp.py

+
+        np.testing.assert_almost_equal(pred[280:293], truth)   
+        np.testing.assert_almost_equal(
+            np.round(pred_raw, decimals=7)[280:293], truth_raw


assert_almost_equal has a decimal kwarg to allow matching on coarser resolution. I guess this is what you do here with the np.round. If possible, please adjust.
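
For illustration, a minimal sketch of the suggested change with made-up numbers (the arrays below are not the values from the test): np.testing.assert_almost_equal takes a decimal argument (default 7), so rounding beforehand is not needed.

```python
import numpy as np

# Made-up values standing in for a model output and its stored reference
pred_raw = np.array([0.12345678, 0.98765432])
truth_raw = np.array([0.1234568, 0.9876543])

# Before: round first, then compare with the default precision
np.testing.assert_almost_equal(np.round(pred_raw, decimals=7), truth_raw)

# After: compare directly and state the precision via the decimal kwarg
np.testing.assert_almost_equal(pred_raw, truth_raw, decimal=7)


def passes_at(decimal):
    """Return True if the comparison passes at the given number of decimals."""
    try:
        np.testing.assert_almost_equal(pred_raw, truth_raw, decimal=decimal)
        return True
    except AssertionError:
        return False
```

The two variants are not strictly identical: rounding changes the compared value, while decimal sets a tolerance of 1.5 * 10**(-decimal) on the absolute difference.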

cchwala · 2023-12-05T10:40:38Z

pycomlink/processing/wet_dry/mlp.py

+    mlp_pred = np.zeros([x_fts.shape[0], 2])*np.nan
+    indices = np.argwhere(~np.isnan(x_fts).any(axis = 1)).ravel()
+
+    if indices.size > 0: # everything is nan, mlp_pred is then all nan


If I understand correctly, this if-statement is not true if the sample is all NaN, and thus we also do not do any prediction. I find the comment misleading since, if I understand correctly, it explains what happens when the if-statement is not true. Can you adjust it to make this clearer?
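
One possible way to make the intent explicit, shown as a sketch with a stand-in predict function (the real model call and the feature layout of x_fts are not shown in this thread, so both are assumptions here): initialise the output as NaN, predict only for rows without NaN, and let the comment describe the branch that actually runs.

```python
import numpy as np


def predict_skipping_nan(x_fts, predict):
    """Sketch: run `predict` only on feature rows that contain no NaN."""
    x_fts = np.asarray(x_fts, dtype=float)
    # Start with all-NaN output; rows that are never predicted stay NaN
    mlp_pred = np.full((x_fts.shape[0], 2), np.nan)
    indices = np.flatnonzero(~np.isnan(x_fts).any(axis=1))
    # Predict only if at least one row is free of NaN;
    # otherwise mlp_pred is returned unchanged (all NaN)
    if indices.size > 0:
        mlp_pred[indices] = predict(x_fts[indices])
    return mlp_pred


def dummy_predict(x):
    """Stand-in for the trained model: returns [p_dry, p_wet] per row."""
    p_wet = np.clip(x.mean(axis=1), 0.0, 1.0)
    return np.column_stack([1.0 - p_wet, p_wet])


# Made-up features: the second row contains a NaN and is skipped
features = np.array([[0.2, 0.4], [np.nan, 0.1], [0.9, 0.7]])
out = predict_skipping_nan(features, dummy_predict)
all_nan_out = predict_skipping_nan(np.full((3, 2), np.nan), dummy_predict)
```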

maxmargraf · 2023-12-05T10:57:30Z

Thanks for the PR!
What about the case when only one sublink is available? Is it advisable to duplicate one sublink to get two, or should this not be done at all?

eoydvin · 2023-12-05T11:12:38Z

Thanks for the PR! What about the case when only one sublink is available? Is it advisable to duplicate one sublink to get two, or should this not be done at all?

Yes, duplicate the sublink so that tl from channel_1 is in channel_1 and channel_2
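
A minimal sketch of this advice, assuming a classifier with the same two-channel call signature as mlp_wet_dry in the snippet at the top of this PR. The stand-in two_channel_classifier below is hypothetical and only there to make the example runnable; the TL values are made up.

```python
import numpy as np


def classify_single_sublink(tl, classifier):
    """Feed the same TL series to both channel inputs of a two-channel classifier."""
    tl = np.asarray(tl, dtype=float)
    return classifier(tl, tl)


def two_channel_classifier(tl_channel_1, tl_channel_2):
    """Hypothetical stand-in: flags time steps where the mean TL exceeds its median."""
    mean_tl = (tl_channel_1 + tl_channel_2) / 2
    return (mean_tl > np.nanmedian(mean_tl)).astype(float)


# Made-up TL series of the only available sublink
tl_single = np.array([50.0, 50.2, 55.0, 58.1, 50.1])
wet = classify_single_sublink(tl_single, two_channel_classifier)
```

With the function from this PR, the call would presumably pass the same tl array as both channel arguments.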

eoydvin · 2023-12-05T13:33:16Z

Thanks @eoydvin 👍

I will add some (probably minor) comments.

One question: Do you have a notebook where you show the application? We do not yet have a notebook that compares the different wet-dry classification methods. But maybe you could start by adding a very simple and minimal notebook for your method. Then we can later add the other methods.

I do, actually. It is just a modification of "Basic CML processing workflow.ipynb"; it compares the different wet/dry detection methods and includes some thoughts. It will need some review, I think. See the uploaded notebook "Wet dry example".

codecov · 2023-12-05T14:34:38Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (28d54a4) 74.83% compared to head (b7c99d4) 75.44%.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #146      +/-   ##
==========================================
+ Coverage   74.83%   75.44%   +0.60%     
==========================================
  Files          29       30       +1     
  Lines        1089     1116      +27     
==========================================
+ Hits          815      842      +27     
  Misses        274      274

Flag	Coverage Δ
unittests	`75.44% <100.00%> (+0.60%)`	⬆️


cchwala · 2023-12-05T14:40:05Z

The notebook looks good. Some comments:

The notebook is a bit busy. Maybe you can clean up the initial part that is similar to the example processing notebook, i.e. just state that it is similar and then remove all (or most of) the output.
The wet-dry comparison part is good, but it would be nice to see a second or even third CML, maybe also one that is more challenging due to noisiness.
I am not sure why you do the full processing of all CMLs at the bottom. You do not do an analysis of the results. Maybe you can remove this part at the bottom?
I would keep it simple for now. Sooner or later there should be a comparison and validation with the reference data. But that will take too long now and is out of scope of this PR because it will require some wet-dry performance metrics and plots. I would keep that for later because @maxmargraf is doing similar things for the method intercomparison and we might be able to merge some code from him into pycomlink (or into the upcoming ragali) for wet-dry validation.

mlp.py:
- added docstring

Wet dry example.ipynb:
- Do pre-processing in one cell, refer to "Basic CML processing workflow.ipynb" for more details.
- Investigate two interesting CMLs
- Shorten the notebook to only compare baselines.

Update: The MLP was retrained using more CMLs and a larger validation dataset.

Wet dry example.ipynb:
- re-run notebook with retrained weights

mlp.py:
- updated docstring to match retrained architecture

model_mlp.keras:
- updated weights and architecture

test_wet_dry_mlp:
- updated to run with new weights

maxmargraf · 2024-01-02T14:50:50Z

Thanks for the changes. Some minor/cosmetic suggestions for the notebook before this PR is ready to be merged from my side:

- Could you rename the notebook to be more specific, e.g. "Rain event detection methods"
- Add a legend to the two plots labeling TL and baseline
- Add plt.tight_layout() or remove overlapping text
- Remove blank cells at the bottom of the notebook
- Notes could be in markdown

Rain event detection methods.ipynb:
- Renamed example notebook to current name
- Updated cosmetic suggestions in example notebook

cchwala · 2024-01-03T13:17:30Z

@eoydvin thanks for the update

@maxmargraf feel free to merge when you think it is ready (I will be offline for the next days)

maxmargraf · 2024-01-03T13:48:12Z

Thanks for adding this new method @eoydvin!

added a simple neural network trained at detecting rainfall with 10mi…

b7579a9

…nute resolution

cchwala reviewed Dec 5, 2023

cchwala changed the title ~~added a simple neural network~~ added a simple MLP neural network for wet-dry classification Dec 5, 2023

upload example notebook comparing the different wet dry methods

19d4a30

remove np.round in assert_almost equal

e1613f3

eoydvin added 3 commits December 8, 2023 18:23

added more extensive doc string to mlp

51380a4

A simple MLP neural network for wet-dry classification:

b7c99d4

maxmargraf merged commit 94b612c into pycomlink:master Jan 3, 2024
5 checks passed
