Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ds000001 example processed with SPM #309

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

cmaumet
Copy link

@cmaumet cmaumet commented Mar 18, 2022

This PR includes an example of BIDS-derivatives dataset: ds000001 processed using SPM (using code available in https://github.com/NISOx-BDI/Software_Comparison/tree/bids_derivatives).

This dataset will be used as an example use-case for BEP028 (provenance)


  • README
  • dataset_description.json

@Remi-Gau
Copy link
Contributor

Remi-Gau commented Mar 19, 2022

Couple of comments on the preprocessed data only as BIDS has nothing official to say on stats output (yet).

Also this is an example of derivative data, so I might be extra annoying to make it "extra BIDS compliant" because it could definitely help implementing validation later on. Feel free to push back on my nitpicking.

  • Should the derivatives include the raw files? (IMHO: no)

I know that this is a very SPM thing to do but I feel this goes against the modularisation of "raw" and "derivatives" that BIDS tends to promote. Also this also leads to data duplication.

ds000001-spm/sub-01/anat
├── sub-01_from-T1w_to-IXI549Space_mode-image_xfm.nii.gz
├── sub-01_label-T1w_segparam.mat
├── sub-01_space-individual_desc-biascor_T1w.nii.gz
├── sub-01_space-individual_desc-skullstripped.gz
├── sub-01_space-individual_label-CSF_probseg.nii.gz
├── sub-01_space-individual_label-GM_probseg.nii.gz
├── sub-01_space-individual_label-WM_probseg.nii.gz
├── sub-01_space-IXI549Space_desc-preproc_T1w.nii.gz
└── sub-01_T1w.nii.gz                                                         <-- original raw file?
  • Mix of .nii and .nii.gz

find ds000001-spm/ -name '*.nii' shows that subject 15 and 16 (and few others) have .nii files while most others have .nii.gz.

Might be better to make everything .nii.gz.

  • missing sidecar JSON

Only highlighting the missing REQUIRED bits.

For preprocessed data, it should be mentioned if they are skullstripped or not
https://bids-specification.readthedocs.io/en/latest/05-derivatives/03-imaging.html#preprocessed-coregistered-andor-resampled-volumes

When using the individual-space, the reference MUST me mentoned:
https://bids-specification.readthedocs.io/en/latest/05-derivatives/02-common-data-types.html#examples_1

REQUIRED fields for the RAW should be propagated to the derivatives:
https://bids-specification.readthedocs.io/en/latest/05-derivatives/01-introduction.html#metadata-conventions

In practice this mostly means TaskName and RepetitionTime for the func bold timeseries.

@Remi-Gau
Copy link
Contributor

Also added some 'tick box' in the top message for other things TODO

Co-authored-by: Remi Gau <remi_gau@hotmail.com>
@cmaumet
Copy link
Author

cmaumet commented Mar 21, 2022

Thanks a bunch @Remi-Gau!

The previous commits includes:

  • Deletion of raw anat and func files
  • All nifti files in .nii.gz
  • Adding missing README + dataset_description.json

I'll work on the JSON files next!

@Remi-Gau
Copy link
Contributor

Remi-Gau commented Mar 21, 2022

I'll work on the JSON files next!

Suspect you can make your life easier by using the inheritance principle for many of the bold files.

@effigies
Copy link
Contributor

Is it possible to submit a small subset of subjects? 2000+ files is hard on the browser, so I can't really review what's in here.

"Name": "SPM - Software Comparison workflow",
"BIDSVersion": "1.4.0",
"DatasetType": "derivative",
"License": "This dataset is made available under the Public Domain Dedication and License \nv1.0, whose full text can be found at \nhttp://www.opendatacommons.org/licenses/pddl/1.0/. \nWe hope that all users will follow the ODC Attribution/Share-Alike \nCommunity Norms (http://www.opendatacommons.org/norms/odc-by-sa/); \nin particular, while not legally required, we hope that all users \nof the data will acknowledge the OpenfMRI project and NSF Grant \nOCI-1131441 (R. Poldrack, PI) in any publications."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"License": "This dataset is made available under the Public Domain Dedication and License \nv1.0, whose full text can be found at \nhttp://www.opendatacommons.org/licenses/pddl/1.0/. \nWe hope that all users will follow the ODC Attribution/Share-Alike \nCommunity Norms (http://www.opendatacommons.org/norms/odc-by-sa/); \nin particular, while not legally required, we hope that all users \nof the data will acknowledge the OpenfMRI project and NSF Grant \nOCI-1131441 (R. Poldrack, PI) in any publications."
"License": "This dataset is made available under the Public Domain Dedication and License \nv1.0, whose full text can be found at \nhttp://www.opendatacommons.org/licenses/pddl/1.0/. \nWe hope that all users will follow the ODC Attribution/Share-Alike \nCommunity Norms (http://www.opendatacommons.org/norms/odc-by-sa/); \nin particular, while not legally required, we hope that all users \nof the data will acknowledge the OpenfMRI project and NSF Grant \nOCI-1131441 (R. Poldrack, PI) in any publications."
Suggested change
"License": "This dataset is made available under the Public Domain Dedication and License \nv1.0, whose full text can be found at \nhttp://www.opendatacommons.org/licenses/pddl/1.0/. \nWe hope that all users will follow the ODC Attribution/Share-Alike \nCommunity Norms (http://www.opendatacommons.org/norms/odc-by-sa/); \nin particular, while not legally required, we hope that all users \nof the data will acknowledge the OpenfMRI project and NSF Grant \nOCI-1131441 (R. Poldrack, PI) in any publications."
"License": "This dataset is made available under the Public Domain Dedication and License \nv1.0, whose full text can be found at \nhttp://www.opendatacommons.org/licenses/pddl/1.0/. \nWe hope that all users will follow the ODC Attribution/Share-Alike \nCommunity Norms (http://www.opendatacommons.org/norms/odc-by-sa/); \nin particular, while not legally required, we hope that all users \nof the data will acknowledge the OpenfMRI project and NSF Grant \nOCI-1131441 (R. Poldrack, PI) in any publications.",

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing comma

Copy link
Contributor

@Remi-Gau Remi-Gau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @cmaumet

had a chat with the bids maintainers on our last meeting.

A) Could you only include a couple of subjects in the dataset and not all of them ?

Because:

  • it makes it hard to review,
  • after the second subject it gets redundant for an example,
  • we expect a growing number of derivatives dataset examples in the coming months / years so keeping of them to a small number of files is going to be important for long term mantenance

It is possible to mention this in the README and to point to where the full dataset with non truncated data can be found.

Several points that will help this dataset more valid as derivatives

B) Could the SCRIPTS folder be renamed code (one of the allowed folders in the root of a BIDS directory)?

C) batch_rename.mat should probably be moved in that code folder.

D) the ONSETS/*.mat (I suspect they contain the Names, Onsets, Duration required by SPM for model specification) could probably be moved somewhere else:

  • within code?
  • or each .mat``file could be moved in the level1` folder of each subject (can be useful if there are several models to keep those onsets files "close" to the stats results they relate to) ?

@@ -0,0 +1,3 @@
{
"SkullStripped": false,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"SkullStripped": false,
"SkullStripped": false

Copy link
Contributor

@Remi-Gau Remi-Gau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing T1w suffix and .nii for some files

sub-*_space-individual_desc-skullstripped.gz -->
sub-*_space-individual_desc-skullstripped_T1w.nii.gz

@Remi-Gau
Copy link
Contributor

trying to come up with a decent .bidsignore to make things BIDS derivatives compatible if we removed the .SKIP_VALIDATION and had a validator that can handle derivatives.

distinguishing:

  • valid BIDS derivatives files (currently described in the specs) that by definition are not in the .bidsignore
  • BIDS derivatives files described in BEP in the bidsingore for now
  • typical SPM flles whose name can be bidsiied without breaking SPM internal machinery
  • typical SPM flles whose name CANNOT be bidsiied without breaking SPM internal machinery

Suggested .bidsignore

### SPM specific output 

## BIDSified filenames

# anat

# deformation fields not supported yet (see BEP 14 https://bids.neuroimaging.io/bep014)
*xfm.nii.gz

# segmentation parameters
sub-*/anat/sub-*_label-T1w_segparam.mat

# func

# realignment parameters
sub-*/func/sub-*_task-*_desc-confounds_regressors.txt

## non BIDSified names (renaming those would break SPM internal machinery)

# stats

SPM.mat
beta_*.nii*
RPV.nii*
ResMS.nii*
mask.nii*
con_*.nii*
spmT_*.nii*
spmF_*.nii*

*nidm.zip    # filename could be bidsified
*.pdf        # filename could be bidsified
*.ps         # filename could be bidsified

@cmaumet

I am having second thought on the Might be better to make *everything* .nii.gz.

For example zipping beta*.nii.gz will make SPM scream. From a data management perspective having .gz is preferable. As a SPM user having to unzipping things before doing stats is inconvenient.

What kind of examples should we set ?

@Remi-Gau
Copy link
Contributor

Remi-Gau commented Apr 24, 2022

@effigies would having this kind of .bidsignore help non-SPMers review this PR?

@Remi-Gau
Copy link
Contributor

Random thought:

sub-01_task-balloonanalogrisktask_run-01_desc-confounds_regressors.txt

should probably be

sub-01_task-balloonanalogrisktask_run-01_motion.txt

according to the BEP on funcitonal derivatives: bids-standard/bids-specification#519

ideally it should even be a TSV files with proper headers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants