
Fix issues in statistical testing #107

Closed
wants to merge 4 commits into from

Conversation

daehrlich
Collaborator

@daehrlich daehrlich commented Jan 18, 2024

This pull request addresses two issues with the current implementation of the statistical testing:

  1. Wrong unsorting
    In both _perform_fdr_correction and max_statistic_sequential in stats.py, the p-values are first sorted and their argsort is saved. After the statistical testing, the resulting significance arrays are supposed to be unsorted, i.e. brought back into the original order, so that the non-significant targets or links can be filtered out.

However, this unsorting is not implemented correctly: the saved argsort is applied in the wrong direction, permuting the array a second time instead of inverting the original permutation. The implementation reads

significance_array = significance_array[argsort_idxs]

where it should instead read

significance_array[argsort_idxs] = significance_array.copy()

where the copy is necessary to avoid reading from the array while it is being overwritten in place.
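A minimal NumPy sketch (with made-up p-values, not IDTxl's actual arrays) illustrating why indexing with the argsort differs from inverting it:

```python
import numpy as np

# Hypothetical p-values in original order; entries 0 and 2 are significant.
pvals = np.array([0.03, 0.50, 0.01])
argsort_idxs = np.argsort(pvals)           # [2, 0, 1]

# Significance computed on the SORTED p-values (0.01 and 0.03 pass, 0.50 fails).
significance_sorted = np.array([True, True, False])

# Buggy unsorting: applies the permutation a second time.
wrong = significance_sorted[argsort_idxs]  # [False, True, True] -- misassigned

# Correct unsorting: scatter each entry back to its original position.
significance = significance_sorted.copy()
significance[argsort_idxs] = significance_sorted
print(significance)                        # [ True False  True] -- matches pvals
```

The buggy version marks the p=0.03 entry non-significant and the p=0.50 entry significant, which is exactly the kind of misassignment the fix addresses.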

  2. Incorrect number of total significance tests in the FDR test
    To obtain the correct p-value thresholds for the FDR correction, the formula requires the total number of tests performed. In _perform_fdr_correction, this number n is computed from the number of p-values that are submitted. However, this is not equal to the total number of comparisons in this multiple-comparison testing problem, because in the function network_fdr the targets are first filtered by their omnibus significance.
    I believe this is wrong, and the filtering step should be eliminated so that the n in the FDR correction does in fact coincide with the total number of targets (if correct_by_target==True) or links (if correct_by_target==False).
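A simplified Benjamini-Hochberg sketch (not the actual _perform_fdr_correction code) showing how the choice of n changes the thresholds; the parameter n_total is a hypothetical stand-in for the true number of comparisons:

```python
import numpy as np

def fdr_significance(pvals, alpha=0.05, n_total=None):
    """Simplified Benjamini-Hochberg step-up procedure. The threshold
    denominator must be the TOTAL number of tests performed (n_total),
    not merely the number of p-values left after pre-filtering."""
    pvals = np.asarray(pvals)
    if n_total is None:
        n_total = len(pvals)  # only valid if no p-values were filtered out
    argsort_idxs = np.argsort(pvals)
    sorted_p = pvals[argsort_idxs]
    # Thresholds alpha * i / n_total for ranks i = 1, ..., len(pvals).
    thresholds = alpha * np.arange(1, len(pvals) + 1) / n_total
    below = sorted_p <= thresholds
    sig_sorted = np.zeros(len(pvals), dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        sig_sorted[: k + 1] = True  # all p-values up to the largest passing rank
    # Unsort back to the original order (the inverse-permutation fix).
    significance = np.empty_like(sig_sorted)
    significance[argsort_idxs] = sig_sorted
    return significance
```

With pvals = [0.001, 0.02, 0.04] and alpha = 0.05, using n_total=3 marks all three significant, while the correct total of, say, six comparisons (n_total=6) marks only the first: an inflated denominator-free n makes the test anti-conservative.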

I have implemented fixes for both issues in the branch dev_fdr_fix. Since these fixes affect the results that IDTxl produces, I strongly recommend a thorough code review of the changes before merging.

@mwibral
Collaborator

mwibral commented Jan 25, 2024

I agree with the detected bugs.
The sorting should be corrected.

For the case correct_by_target==True, the number of comparisons should indeed be the number of targets (as per Novelli et al., Network Neuroscience, 2019).

For the case correct_by_target==False, one should correct by:
number_of_candidates = number_of_unique_combinations_of_timelags_and_source_processes.

This is because, at the moment, IDTxl only provides p-values per candidate, not per link, as far as I know.
(But this should be double-checked! Although I doubt it, there may be per-link mTE values and p-values present in the returned results -- if so, correcting the per-link p-values by the number of links in the network would be preferable.)

Note: an efficient test at the level of candidates (sources x lags) would require testing all candidates across all targets simultaneously and entering them into the maximum statistic for inclusion. This non-hierarchical testing is not supported in IDTxl at the moment and would require a major rework of the full code path.
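As a toy illustration of the correct_by_target==False count (hypothetical candidate tuples, not IDTxl's actual results structure), the number of comparisons would be the number of unique (source process, time lag) combinations rather than the raw number of candidate entries:

```python
# Hypothetical selected candidates across all targets, as
# (source process, time lag) pairs; (0, 1) appears for two targets.
candidates = [(0, 1), (0, 2), (1, 1), (0, 1)]

# Number of comparisons = unique combinations of source process and lag.
n_comparisons = len(set(candidates))
print(n_comparisons)  # 3, not 4
```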

@daehrlich daehrlich added the bug label Jan 25, 2024
@daehrlich daehrlich linked an issue Jan 25, 2024 that may be closed by this pull request
@pwollstadt
Owner

Merged into develop

@pwollstadt pwollstadt closed this Feb 2, 2024
Successfully merging this pull request may close these issues.

Issues in statistical testing