[WIP] Split PrepPipeline into separate methods, make final interpolation optional #99

a-hurst · 2021-07-03T04:28:19Z

PR Description

(Eventually) closes #73. Currently, this PR:

Adds separate remove_line_noise and robust_reference methods to PrepPipeline.
Makes final bad channel interpolation optional.
Removes a bunch of unused matplotlib-based MATLAB comparison code in the PrepPipeline unit tests.
Removes support for notch filter types other than spectrum_fit, since you can now just filter line noise however you want and then run PrepPipeline.robust_reference if you want to use another kind. This vastly simplifies the demands on the filter_kwargs API and should make the PrepSettings filtering options much easier to document.
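
A rough sketch of the workflow this split is meant to allow, using a toy stand-in class rather than the real PrepPipeline. Only the method names remove_line_noise and robust_reference come from this PR. The ToyPipeline class, the dict-of-lists data layout, the my_own_filter helper, and the simple mean-based placeholders for each stage are assumptions made purely for illustration.

```python
from statistics import mean


class ToyPipeline:
    """Toy stand-in for a pipeline whose stages can be run separately."""

    def __init__(self, data):
        # data: dict mapping channel name -> list of samples (hypothetical layout)
        self.data = {ch: list(x) for ch, x in data.items()}
        self.stages_run = []

    def remove_line_noise(self):
        # Placeholder for line-noise removal: only removes each channel's mean.
        self.data = {ch: [v - mean(x) for v in x] for ch, x in self.data.items()}
        self.stages_run.append("remove_line_noise")

    def robust_reference(self):
        # Placeholder for robust referencing: subtracts the across-channel mean
        # at every sample (a plain average reference, not a robust estimate).
        n_samples = len(next(iter(self.data.values())))
        ref = [mean(x[i] for x in self.data.values()) for i in range(n_samples)]
        self.data = {
            ch: [v - r for v, r in zip(x, ref)] for ch, x in self.data.items()
        }
        self.stages_run.append("robust_reference")


def my_own_filter(data):
    # Any user-chosen filtering could go here; this one is an identity stand-in.
    return data


raw = {"Cz": [1.0, 2.0, 3.0], "Pz": [3.0, 2.0, 1.0]}

# Option 1: use the built-in line-noise step, then reference.
full = ToyPipeline(raw)
full.remove_line_noise()
full.robust_reference()

# Option 2: filter however you like, then run only the referencing step.
custom = ToyPipeline(my_own_filter(raw))
custom.robust_reference()

print(full.stages_run)
print(custom.stages_run)
```

Because the second option never touches the built-in filter, the pipeline would no longer need to accept arbitrary filter settings, which is the simplification described above.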

There's still definitely more cleanup to do, but I figured it'd be good to get the core of this up for review sooner rather than later!

Merge Checklist

the PR has been reviewed and all comments are resolved
all CI checks pass
(if applicable): the PR description includes the phrase closes #<issue-number> to automatically close an issue
(if applicable): bug fixes, new features, or API changes are documented in whats_new.rst

a-hurst · 2021-07-03T04:33:12Z

Whoops, looks like one of the examples relies on an undocumented attribute I removed for the sake of RAM (self.EEG_new). Will address that tomorrow.

codecov-commenter · 2021-07-03T04:51:20Z

Codecov Report

Merging #99 (0688a5c) into master (ec44384) will decrease coverage by 0.35%.
The diff coverage is 95.34%.

❗ Current head 0688a5c differs from pull request most recent head 6ff2755. Consider uploading reports for the commit 6ff2755 to get more accurate results

@@            Coverage Diff             @@
##           master      #99      +/-   ##
==========================================
- Coverage   99.04%   98.68%   -0.36%     
==========================================
  Files           7        7              
  Lines         734      762      +28     
==========================================
+ Hits          727      752      +25     
- Misses          7       10       +3

Impacted Files             Coverage Δ
pyprep/reference.py        97.88% <90.00%> (-1.30%) ⬇️
pyprep/prep_pipeline.py    98.59% <100.00%> (-1.41%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ec44384...6ff2755.

sappelhoff

thanks, it's great that we don't need the matplotlib stuff in the tests anymore. Also reminds me of #15

sappelhoff · 2021-07-15T08:25:01Z

It is possible that I introduced these conflicts 😖 it's because I worked on dbe2062 before pulling master ... then pulled master, rebased, and force pushed. That was a bad idea - I hope nothing serious got destroyed.

a-hurst · 2021-07-17T00:08:27Z

It is possible that I introduced these conflicts 😖 it's because I worked on dbe2062 before pulling master ... then pulled master, rebased, and force pushed. That was a bad idea - I hope nothing serious got destroyed.

No worries, I'll rebase before I push any more work! I've set this aside for a few days while tackling some other projects, but I'll try and get this wrapped up within the next week or so :)

a-hurst · 2021-08-04T22:21:04Z

Okay @sappelhoff, I've gotten the PrepPipeline API rewrite to a point I think I'm happy with (though I'd love to hear your thoughts and @yjmantilla's). Some of the attributes aren't documented yet and much of it is untested, but I figured it'd be best to handle that side of things once I was sure we were all happy with the changes.

My main changes here are:

Renamed EEG_before_interpolation to EEG_post_reference and reference_before_interpolation to robust_reference_signal
Added a getter method current_reference_signal that returns reference_before_interpolation if interpolation hasn't been done, or the post-interpolation average reference (previously reference_after_interpolation) if it has (and None if neither have been performed, though that could be changed).
Moved all the loose noisy_channels_original, noisy_channels_before_interpolation, bad_before_interpolation, still_noisy_channels, etc. attributes into two dicts: noisy_info (which contains the full noisy channel dicts) and bad_channels (which just contains the bad channel names), each with the keys 'original', 'post-reference', and 'post-interpolation'. This makes things easier to document and has the advantage of a more consistent/informative naming scheme.
Added current_noisy_info and remaining_bad_channels attributes that get the current noisy info and bad channel names for the pipeline (will retrieve post-interpolation values if interpolation was used, otherwise returns the pre-interpolation values). These allow for people to try enabling/disabling interpolation for their data without having to rewrite their code to access different attributes.
Added a new get_raw method, which is a more flexible version of the current raw getter that allows retrieving the full mne.io.Raw object with the EEG data from any given stage in the pipeline.
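
The per-stage dict layout and the "current" accessors described above could look roughly like the following. This is a toy sketch, not the code in this PR: the stage keys and the names bad_channels and remaining_bad_channels come from the summary, while the BadChannelRecord class, its record method, and the "return the latest filled stage" rule are assumptions.

```python
STAGES = ("original", "post-reference", "post-interpolation")


class BadChannelRecord:
    """Toy container mirroring the proposed per-stage dict layout."""

    def __init__(self):
        # One entry per stage; None means the stage has not been run yet.
        self.bad_channels = {stage: None for stage in STAGES}

    def record(self, stage, names):
        if stage not in self.bad_channels:
            raise ValueError(f"Unknown stage: {stage!r}")
        self.bad_channels[stage] = list(names)

    @property
    def remaining_bad_channels(self):
        # Walk the stages from latest to earliest and return the first filled one.
        for stage in reversed(STAGES):
            if self.bad_channels[stage] is not None:
                return self.bad_channels[stage]
        return None


rec = BadChannelRecord()
rec.record("original", ["Fp1", "T7", "O2"])
rec.record("post-reference", ["T7", "O2"])
print(rec.remaining_bad_channels)  # interpolation not run yet

rec.record("post-interpolation", [])
print(rec.remaining_bad_channels)  # interpolation has been run
```

Code that reads remaining_bad_channels stays the same whether or not the interpolation stage was recorded, which is the convenience the summary argues for.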

Let me know what you think!

examples/run_full_prep.py

sappelhoff · 2021-08-05T08:38:06Z

I didn't take a look at the diff because I am not sure what diff will stay after #99 (comment) is resolved, but I still have concerns over these two points (I am commenting only based on your summary comment):

returns reference_before_interpolation if interpolation hasn't been done, or the post-interpolation average reference (previously reference_after_interpolation) if it has

and

(will retrieve post-interpolation values if interpolation was used, otherwise returns the pre-interpolation values)

Could we not just return a dict in both cases, with the keys pre_interpolation and post_interpolation ... and the values for these keys being None or empty dicts if these values have not yet been computed? I think that'd be clearer. An alternative would be to send a log message upon the getter call that says something like: "you are getting the post_interpolation values". The minimum would be to have an attribute that can clearly tell you whether the prep object is "pre" or "post" interpolation, but we might already have something like that, not sure right now.

What do you think?
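
To make the three alternatives in this comment concrete, here is a minimal sketch that combines them in one toy class: a getter that always returns both keys, an explicit flag for the pre/post state, and a log message when the convenience getter is used. The keys pre_interpolation and post_interpolation and the log wording come from the comment. The class name, method names, and logger name are hypothetical.

```python
import logging

logger = logging.getLogger("toy_prep")


class ToyReferenceInfo:
    """Toy holder for reference signals before and after interpolation."""

    def __init__(self):
        # Both keys always exist; None marks a value that is not computed yet.
        self._signals = {"pre_interpolation": None, "post_interpolation": None}

    def store(self, key, signal):
        if key not in self._signals:
            raise KeyError(key)
        self._signals[key] = list(signal)

    @property
    def interpolated(self):
        # Explicit flag telling callers which state the object is in.
        return self._signals["post_interpolation"] is not None

    def get_reference_signals(self):
        # Return a copy so callers cannot alter the internal dict's keys.
        return dict(self._signals)

    def get_current_reference(self):
        # Convenience getter that announces which version it hands back.
        key = "post_interpolation" if self.interpolated else "pre_interpolation"
        logger.info("you are getting the %s values", key)
        return self._signals[key]


info = ToyReferenceInfo()
info.store("pre_interpolation", [0.1, 0.2, 0.3])
print(info.get_reference_signals())
print(info.interpolated)
```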

a-hurst · 2021-08-05T15:22:19Z

Could we not just return a dict in both cases, with the keys pre_interpolation and post_interpolation ... and the values for these keys being None or empty dicts if these values have not yet been computed? I think that'd be clearer. An alternative would be to send a log message upon the getter call that says something like: "you are getting the post_interpolation values". The minimum would be to have an attribute that can clearly tell you whether the prep object is "pre" or "post" interpolation, but we might already have something like that, not sure right now.

That could work, though I still think it'd be useful to have an attribute like current_reference_signal so there's a consistent way to get the final reference signal regardless of whether interpolation was enabled or disabled. For the last project we did with PyPREP we tried the analysis pipeline both with and without final bad channel interpolation, so I think that having an API where you can toggle interpolation on and off without needing to change any attributes you access afterwards would be handy.

Instead of attributes, maybe methods in the style of the new get_raw function would be clearer? For example, a get_noisy_info method that returns the most recent noisy info available by default (like current_noisy_info does now), but also has an argument that lets you specify explicitly whether you want the original, post-reference, or post-interpolation noisy info.

Keep in mind that for the PrepPipeline API I decided against exposing interpolation as a separate method and instead made it an optional flag to robust_reference(), so it should hopefully be clear to users based on the settings they chose (and the documentation) whether the "current reference signal" reflects interpolated data or not.

sappelhoff

Instead of attributes, maybe methods in the style of the new get_raw function would be clearer? For example, a get_noisy_info method that returns the most recent noisy info available by default (like current_noisy_info does now), but also has an argument that lets you specify explicitly whether you want the original, post-reference, or post-interpolation noisy info.

yes! That sounds good to me, I like the new get_raw 👍 But I think the default should be "current", not "None", to be more explicit.

I approve of the remaining changes, but I am slowly losing my grasp on the potential workflows we can have with our different classes and their methods 😇 it's been too long since I actually worked with this code, I think.
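
A minimal sketch of the getter agreed on here, with an explicit "current" default instead of None. The method name get_noisy_info and the stage names come from the discussion. The ToyPrep class, its store method, and the exact error messages are assumptions, not the implementation in this PR.

```python
class ToyPrep:
    """Toy object exposing per-stage noisy-channel info through one getter."""

    _STAGES = ("original", "post-reference", "post-interpolation")

    def __init__(self):
        self._noisy_info = {}

    def store(self, stage, info):
        # Hypothetical helper standing in for the real pipeline stages.
        self._noisy_info[stage] = info

    def get_noisy_info(self, stage="current"):
        if stage == "current":
            # Resolve "current" to the latest stage that has actually been run.
            done = [s for s in self._STAGES if s in self._noisy_info]
            if not done:
                raise ValueError("No pipeline stage has been performed yet.")
            stage = done[-1]
        elif stage not in self._STAGES:
            raise ValueError(f"stage must be 'current' or one of {self._STAGES}")
        elif stage not in self._noisy_info:
            raise ValueError(
                f"Could not retrieve {stage} data, as that stage of the pipeline "
                "has not yet been performed."
            )
        return self._noisy_info[stage]


prep = ToyPrep()
prep.store("original", {"bad_all": ["T7", "O2"]})
prep.store("post-reference", {"bad_all": ["O2"]})

print(prep.get_noisy_info())            # latest available stage
print(prep.get_noisy_info("original"))  # explicitly requested stage
```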

sappelhoff · 2021-08-07T13:35:23Z

pyprep/prep_pipeline.py

+
+        Parameters
+        ----------
+        stage : str, optional


I think I'd do something like this:

Suggested change

-        stage : str, optional
+        stage : {"unprocessed", "filtered", "post-reference", "post-interpolation"}, optional

That causes the line to go beyond 88 characters. Is line wrapping for argument types something that's supported by the NumPy docstring style?
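
One possible way to sidestep the line-length question, offered only as a sketch and not necessarily what was adopted here, is to keep the type line short and list the allowed values in the parameter description. The check_stage function below is a hypothetical helper written for this illustration.

```python
STAGES = ("unprocessed", "filtered", "post-reference", "post-interpolation")


def check_stage(stage="unprocessed"):
    """Validate a pipeline stage name (hypothetical helper).

    Parameters
    ----------
    stage : str, optional
        The stage to look up. Must be one of ``"unprocessed"``,
        ``"filtered"``, ``"post-reference"``, or ``"post-interpolation"``.
        Defaults to ``"unprocessed"``.

    Returns
    -------
    stage : str
        The validated stage name.

    """
    if stage not in STAGES:
        raise ValueError(f"Unknown stage: {stage!r}")
    return stage


# Length of the longest docstring line, to compare against the 88-character limit.
longest = max(len(line) for line in check_stage.__doc__.splitlines())
print(longest)
```

The trade-off is that the allowed values no longer appear in the type slot, where the brace notation from the suggestion above would put them.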

sappelhoff · 2021-08-07T13:37:03Z

pyprep/prep_pipeline.py

+                    "Could not retrieve {stage} data, as that stage of the pipeline "
+                    "has not yet been performed."


Suggested change

-                    "Could not retrieve {stage} data, as that stage of the pipeline "
-                    "has not yet been performed."
+                    f"Could not retrieve {stage} data, as that stage of the pipeline "
+                    "has not yet been performed."

Whoops, nice catch!
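
For readers unfamiliar with why the suggestion above matters: when adjacent string literals are concatenated, only fragments marked with the f prefix have their placeholders filled in. A small standalone demonstration, where the stage value is just an example:

```python
stage = "post-interpolation"

# Without the f prefix, the placeholder is left in the message as literal text.
plain = (
    "Could not retrieve {stage} data, as that stage of the pipeline "
    "has not yet been performed."
)

# The f prefix is only needed on the fragment that contains a placeholder.
formatted = (
    f"Could not retrieve {stage} data, as that stage of the pipeline "
    "has not yet been performed."
)

print(plain)
print(formatted)
```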

a-hurst · 2021-08-07T15:31:51Z

yes! That sounds good to me, I like the new get_raw 👍 But I think the default should be "current", not "None", to be more explicit.

Great! I'll get to work on this, as well as proper tests and docs for everything new. In that case, for the sake of simplicity I'm going to leave the internal noisy info and reference signal attributes undocumented so that the new get_x methods are the one clear official way for users to get at that data.

I approve of the remaining changes, but I am slowly losing my grasp on the potential workflows we can have with our different classes and their methods 😇 it's been too long since I actually worked with this code, I think.

Hopefully cleaning up the example scripts for 0.4 will refresh us all on that front!

sappelhoff reviewed Jul 4, 2021

sappelhoff force-pushed the master branch from 1c9589c to 9337811 Compare July 7, 2021 09:47

sappelhoff mentioned this pull request Aug 2, 2021

Release 0.4 #101

Closed

a-hurst force-pushed the split_pipeline_methods branch from 3e67ad7 to 0688a5c Compare August 2, 2021 22:42

a-hurst mentioned this pull request Aug 3, 2021

Always keep EEG data in Volts instead of converting to mV #102

Merged

4 tasks

a-hurst added 9 commits August 4, 2021 10:59

Remove some unused code

6112c32

Make interpolation a separate Reference method

8ba133b

Remove unused MATLAB comparison test code

8ec209f

Add new separate methods for prep stages

5191d85

Make remove_line_noisy only use spectrum_fit

f98bb9f

Fix black's dict complaints

11987e2

Try fixing PREP example

fce1724

Move bad chan info to dicts, rename attributes

4e0e039

Add get_raw method for easier data access

c4bb806

a-hurst force-pushed the split_pipeline_methods branch from 0688a5c to c4bb806 Compare August 4, 2021 21:37

a-hurst added 2 commits August 4, 2021 19:30

Fix full PREP example

46256b3

Update names and scalings for PREP example

6ff2755

sappelhoff marked this pull request as ready for review August 5, 2021 08:19

sappelhoff reviewed Aug 5, 2021

examples/run_full_prep.py (resolved)

sappelhoff reviewed Aug 7, 2021

a-hurst mentioned this pull request Oct 21, 2021

Use the MNE logger to set the verbosity #105

Closed

sappelhoff force-pushed the master branch from bc767cf to d2c7dc0 Compare October 22, 2021 07:20

sappelhoff force-pushed the master branch from d2c7dc0 to 16c25eb Compare October 22, 2021 07:25

sappelhoff force-pushed the master branch from e900f6c to acf0d4f Compare February 2, 2022 17:44

Merge branch 'master' into split_pipeline_methods

46b164a

sappelhoff force-pushed the master branch from e89ab72 to 0ca073e Compare November 20, 2022 12:32

sappelhoff changed the base branch from master to main October 27, 2023 09:55
