Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Add PSD feature via MNE #73

Open
wants to merge 30 commits into
base: main
Choose a base branch
from

Conversation

anandsaini024
Copy link
Contributor

PSD feature creation script.

psd feature creation script
@anandsaini024 anandsaini024 changed the title [ENH] Add PSD feature via MNECreate psd.py [ENH] Add PSD feature via MNE Jul 19, 2022
@adam2392
Copy link
Member

Just to clarify, how are these and the topographic PR different from what is included in ICLabel?

I might've missed this discussion.

@codecov
Copy link

codecov bot commented Jul 19, 2022

Codecov Report

Merging #73 (07888b3) into main (33856df) will increase coverage by 0.08%.
The diff coverage is n/a.

❗ Current head 07888b3 differs from pull request most recent head e1f1d6a. Consider uploading reports for the commit e1f1d6a to get more accurate results

@@            Coverage Diff             @@
##             main      #73      +/-   ##
==========================================
+ Coverage   86.76%   86.84%   +0.08%     
==========================================
  Files          15       16       +1     
  Lines         831      859      +28     
  Branches      101      105       +4     
==========================================
+ Hits          721      746      +25     
- Misses         82       83       +1     
- Partials       28       30       +2     
Impacted Files Coverage Δ
main/mne_icalabel/utils/_docs.py 93.10% <0.00%> (ø)
main/mne_icalabel/features/psd.py 89.28% <0.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@mscheltienne
Copy link
Member

mscheltienne commented Jul 19, 2022

@adam2392 The ICLabel ones are a copy/paste of the MATLAB features.. which are not super clean and nice.
For instance, the ICLabel rpsd feature is supposed to have a random component right? Well.. in practice that's not the case because it does something along the lines of:

some_2d_matrix = [..., ...]
subsets = [indices of all columns]  # instead of an actual subset
select_and_shuffle = shuffle(some_2d_matrix [:, subsets])  # shuffle the columns
average = avg(select_and_shuffle, axis=1)  # avg across columns

And that's not the only example of 'weird' coding choices/behaviors. I added comment in the features.py file describing those items when I ran into them, e.g.

# in MATLAB, 'pct_data' variable is never provided and is always initialized
# to 100. 'pct_data' is only used in an 'if' statement reached if 'pct_data'
# is different than 100.. thus, 'pct_data' is not used by this autocorrelation
# function and is omitted here.

which corresponds to the pct_data variable in this file https://github.com/sccn/ICLabel/blob/d0cfb0f8638cd4518ee2e66a3bff0bb32ff40969/eeg_autocorr_welch.m#L1-L25

variable which is never provided to the function https://github.com/sccn/ICLabel/blob/d0cfb0f8638cd4518ee2e66a3bff0bb32ff40969/ICL_feature_extractor.m#L77

Or this one:

# In MATLAB, 2 scenarios are defined:
# - EEG.pnts < EEG.srate, which never occurs since then raw provided to
# this autocorrelation function last at least 5 second.
# - EEG.pnts > EEG.srate, implemented below.

which corresponds to https://github.com/sccn/ICLabel/blob/d0cfb0f8638cd4518ee2e66a3bff0bb32ff40969/eeg_autocorr_welch.m#L36-L44


Overall, I'm not super confident in the ICLabel features, so I asked Anand to prepare functions to extract features from Instance/ICA decomposition using MNE, starting with the topographic map and the PSD based on the ica.plot_properties method. The idea is to use those functions to extract features for future models.

@adam2392
Copy link
Member

FYI to fix the style easily and the sorting of imports:

black ./mne_icalabel/* and isort ./mne_icalabel commands, or type make black and make isort. The other checks like flake8 (style), pydocstyle, and mypy can be fixed manually.

doc/whats_new.rst Outdated Show resolved Hide resolved
mne_icalabel/features/psd.py Outdated Show resolved Hide resolved
mne_icalabel/features/psd.py Outdated Show resolved Hide resolved
mne_icalabel/features/psd.py Outdated Show resolved Hide resolved
mne_icalabel/features/psd.py Outdated Show resolved Hide resolved
mne_icalabel/features/psd.py Outdated Show resolved Hide resolved
mne_icalabel/features/tests/test_psd.py Outdated Show resolved Hide resolved
anandsaini024 and others added 3 commits July 22, 2022 15:02
Co-authored-by: Mathieu Scheltienne <mathieu.scheltienne@gmail.com>
doc/api.rst Show resolved Hide resolved
Comment on lines 16 to 19
def test_psd_feature():
"""Test scalp topography array generation"""
psds_mean = get_psds(ica, raw)
assert isinstance(psds_mean, np.ndarray)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a better test perhaps? Anything that can be inspired from MNE-Python test suite?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes definitely.

@mscheltienne
Copy link
Member

@anandsaini024 Please look into additional tests as we discussed:

  • test inspired from tests of the PSD functions in MNE
  • test on signals with known frequency content, e.g. sinusoid, amplitude modulated signal, ...
  • test with different inputs for the arguments of get_psds

I'll try to double-check this PR this week.

@adam2392 adam2392 mentioned this pull request Aug 4, 2022
Copy link
Member

@mscheltienne mscheltienne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anandsaini024 Please check that the output you are getting from the function makes sense also when you vary the different arguments.
For instance, what happens when you provide a raw which does have "bad" annotations and reject_by_annotation was set to True? Or to False?
And so on.. please test all the arguments in different scenarios to make sure they each work. Those tests that you will do manually at the beginning can then be incorporated in test_psd_arguments.

Comment on lines +47 to +56
with pytest.raises(ValueError):
get_psds(ica, inst, picks="eeg")
with pytest.raises(ValueError, match="All picks must be < n_channels"):
get_psds(ica, inst, picks=10)
with pytest.raises(ValueError):
get_psds(ica, inst, fmax=[210, 410])
with pytest.raises(ValueError):
get_psds(ica, inst, fmax="string")
with pytest.raises(ValueError):
get_psds(ica, inst, normalization="string")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the match keyword and either part of the error message, the entire error message, or a regex pattern that matches the error. This will avoid catching ANY ValueError.

with pytest.raises(Error, match="error I want to test against"):
    function_that_raises_that_error()

Also, if you notice that an error message is not super clear, then you can start checking the user input at the beginning of the function. I've done it for fmin and fmax, but any error message that does not give clear guidelines to the user on how to correct it is not good enough.

@mmagnuski
Copy link
Member

Hi, I popped in to see whats new in mne-icalabel (great project!) and saw your discussion of pct_data variable and how it is not used in the Matlab code. I think (but I may be wrong) that it is related to how EEGLAB usually computed PSD - to save time the spectrum was computed only for a portion of the data (especially important for very long raw recordings and the fact to you expect instant feedback when working via GUI). This option was controlled with a special parameter that in research usage would be set to 100. I think this may be the reason for the behavior of pct_data you see.

@mscheltienne
Copy link
Member

Maybe, I didn't dig in too far, but it looked to me like the feature extraction code I translated was half-baked.
And I think the unused pct_data variable appears both in the PSD and autocorrelation features.

Also, I'll finalize this PR at some point ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants