Interactions with QSIrecon output and BIDS specification #1081

Open
dkp opened this issue Jan 3, 2024 · 9 comments

@dkp

dkp commented Jan 3, 2024

This is a duplicate of an issue posted on the qsiprep site: PennLINC/qsiprep#663
I post it here because it is about a mismatch between the directory structure expected by the two tools.

I believe that the BIDS structure assumed by pyAFQ needs to be updated to handle:

  1. A derivatives directory that is a sibling of the raw BIDS directory, NOT necessarily a subdirectory of the raw BIDS directory
  2. A derivatives data structure with nesting under subject and session AND subject dwi directory. This latter structure is what qsirecon does and, as I understand it, is more in keeping with the expected structure for BIDS derivatives which should mirror the structure of the BIDS raw directory.

Specifically, the issue that we are having appears to stem from a mismatch between the output file structure of the qsirecon pyAFQ tractometry pipeline and the file structure expected by the pyAFQ library.

When we run our qsiprep preprocessed data with qsirecon we get a file structure like this:

qsirecon
├── logs
└── sub-CAM003
    ├── figures
    └── ses-01brain
        └── dwi
            └── sub-CAM003_ses-01brain_dir-AP_space-T1w_desc-preproc
                ├── bundles
                ├── clean_bundles
                ├── ROIs
                ├── viz_bundles
                └── viz_core_bundles

But when we run a pyAFQ pipeline on our data, it produces a different, shallower output file structure:

afq.pyafq/
└── sub-CAM003
    └── ses-01brain
        ├── bundles
        ├── ROIs
        ├── viz_bundles
        └── viz_core_bundles

This difference led us to try restructuring the qsirecon output to match the shallower structure. This worked, as long as we also inserted a dataset_description.json file in the parent of the derivatives directory (presumably because pyAFQ assumes the derivatives will be nested under the raw BIDS directory):

qsirecon
├── logs
└── sub-CAM003
    ├── figures
    └── ses-01brain
        ├── bundles
        ├── clean_bundles
        ├── ROIs
        ├── viz_bundles
        └── viz_core_bundles
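
For anyone who needs the same workaround, the restructuring described above can be sketched roughly as follows. This is a minimal illustration and not part of pyAFQ or qsirecon: `flatten_session` is a hypothetical helper, and it assumes every subdirectory of `dwi` is a pyAFQ output folder like the one in the first tree.

```python
import shutil
from pathlib import Path


def flatten_session(session_dir):
    """Move pyAFQ outputs from <session>/dwi/<prefix>/ up to <session>/.

    Hypothetical helper; assumes the qsirecon nesting shown above.
    Returns the names of the entries that were moved.
    """
    session_dir = Path(session_dir)
    dwi_dir = session_dir / "dwi"
    moved = []
    if not dwi_dir.is_dir():
        return moved
    for prefix_dir in [p for p in dwi_dir.iterdir() if p.is_dir()]:
        for entry in list(prefix_dir.iterdir()):
            target = session_dir / entry.name
            if target.exists():
                # Never overwrite results that are already in place.
                raise FileExistsError(f"refusing to overwrite {target}")
            shutil.move(str(entry), str(target))
            moved.append(entry.name)
        prefix_dir.rmdir()  # empty after the moves above
    if not any(dwi_dir.iterdir()):
        dwi_dir.rmdir()  # remove dwi only if nothing else lives there
    return moved
```

It would be called once per session directory, for example `flatten_session("qsirecon/sub-CAM003/ses-01brain")`. Working on a copy of the qsirecon output is probably safer than modifying it in place.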

Although we can generate the afq-browser information and successfully view the tractometry if we restructure the directories and add the dataset_description.json at the top level, we believe that pyAFQ should be updated to more flexibly handle the output of qsirecon directly.

@36000
Collaborator

36000 commented Jan 3, 2024

Yes, it looks like in the QSIprep case the "sub-CAM003_ses-01brain_dir-AP_space-T1w_desc-preproc" folder should not be there.

@dkp
Author

dkp commented Jan 3, 2024

I agree that the sub-CAM003_ses-01brain_dir-AP_space-T1w_desc-preproc folder under the QSIrecon dwi folder is a peculiar-seeming choice! But I note that pyAFQ is not satisfied even when everything is moved up one level into the dwi folder, AND pyAFQ requires the dataset_description.json file from the raw directory to be in the parent of derivatives.

@36000
Collaborator

36000 commented Jan 3, 2024

OK, thank you for the information. I will go into qsiprep and fix this so that the dataset_description.json file is generated and the file structure is correct.

@dkp
Author

dkp commented Jan 3, 2024

Thank you! pyAFQ_tractometry is such a cool reconstruction option! I'm excited to have it be a bit easier to generate the AFQ-Browser results from the qsirecon output.

@arokem
Collaborator

arokem commented Jan 3, 2024

Thanks for the feedback! Regarding your specific comments:

I believe that the BIDS structure assumed by pyAFQ needs to be updated to handle:

  1. A derivatives directory that is a sibling of the raw BIDS directory, NOT necessarily a subdirectory of the raw BIDS directory.

I believe that's already the case, isn't it?

  2. A derivatives data structure with nesting under subject and session AND subject dwi directory. This latter structure is what qsirecon does and, as I understand it, is more in keeping with the expected structure for BIDS derivatives which should mirror the structure of the BIDS raw directory.

I think that you are right, and we need to add the intermediate "dwi" level to our derivative structure to better conform with how BIDS derivatives are expected to be organized.

@36000
Collaborator

36000 commented Jan 3, 2024

I think they are pointing out that when pyAFQ is run through QSIprep, its outputs land in an odd folder structure with no dataset_description.json, which makes it hard to then generate the AFQ-browser instance (the normal pyAFQ results are fine). Another question here is whether we want to continue to support AFQ-browser, because at some point it will be replaced by https://github.com/nrdg/tractoscope.

@dkp
Author

dkp commented Jan 3, 2024

Hi @arokem,

  1. As far as we can tell, pyAFQ looks for a dataset_description.json file in the parent of derivatives. This assumes derivatives is a child of the BIDS raw directory (and not a sibling):

Here is what we did (please let us know if we missed an opportunity):
(In this example, dataset_description.json must appear under bids AND the derivatives must be nested under bids as well).

import os

import AFQ.data.fetch as afd
from AFQ.api.group import GroupAFQ
from AFQ.definitions.image import ImageFile

# Select the brain mask that qsiprep wrote in T1w space.
brain_mask_definition = ImageFile(
    suffix="mask",
    filters={'desc': 'brain',
             'space': 'T1w',
             'scope': 'qsiprep'})

my_afq = GroupAFQ(
    bids_path=os.path.join(os.getcwd(), 'bids'),
    preproc_pipeline="qsiprep",
    # following works on modified qsirecon dir if dataset_description.json is in killgore:
    output_dir="bids/derivatives/qsirecon",
    # passed through to the pybids layout: skip validation and metadata indexing
    bids_layout_kwargs={"validate": False, "index_metadata": False},
    brain_mask_definition=brain_mask_definition
)

print(f"my_afq={my_afq}")
my_afq.export_all()

This code only works IF there is a default skeleton dataset_description.json directly under bids (i.e., above the derivatives directory). This would only be the case if derivatives are nested UNDER the BIDS raw directory (which used to be the BIDS standard default). However, if derivatives are a sibling to bids, then our code fails because it can't find dataset_description.json.
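
A skeleton file of the kind mentioned above could be generated along these lines. This is only a sketch: `ensure_dataset_description` is a hypothetical helper, not a pyAFQ function. `Name` and `BIDSVersion` are the two fields the BIDS specification requires in dataset_description.json; the values used here are placeholders.

```python
import json
from pathlib import Path


def ensure_dataset_description(study_dir, name="Unnamed study",
                               bids_version="1.8.0"):
    """Write a skeleton dataset_description.json unless one already exists.

    Hypothetical helper; the default name and version are placeholders.
    Returns the path to the file.
    """
    path = Path(study_dir) / "dataset_description.json"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        skeleton = {"Name": name, "BIDSVersion": bids_version}
        path.write_text(json.dumps(skeleton, indent=2) + "\n")
    return path
```

Calling `ensure_dataset_description("bids")` before constructing `GroupAFQ` would cover the nested layout used in the snippet above. An existing file is left untouched.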

@36000 I think qsirecon is closer to correct, and that:

  • pyAFQ should handle the case where the dataset_description.json is NOT in the parent of derivatives
    AND
  • pyAFQ should handle looking down under subject/session into a dwi directory.

All of that said, qsirecon nests the pyAFQ_tractometry output under an extra level *_dir-AP_space-T1w_desc-preproc in the subj/session/dwi directory. This seems odd.

I hope you will consider supporting these changes as it looks like tractoscope is a ways off.

@arokem
Collaborator

arokem commented Jan 4, 2024

pyAFQ should handle the case where the dataset_description.json is NOT in the parent of derivatives

I don't think that's right. The parent folder of the derivative folder is the overall BIDS study folder, which should always have a dataset_description file. It may also contain a folder with the raw data, but it doesn't have to.

pyAFQ should handle looking down under subject/session into a dwi directory.

Yes - agreed about that. I think we need to change pyAFQ to add this level of nesting.
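
As a rough illustration of the extra level of nesting, output paths could be built with a datatype directory below the subject and session. The helper name `derivative_dir` and its signature are assumptions made for this sketch; it does not reflect how pyAFQ currently constructs paths.

```python
from pathlib import Path


def derivative_dir(root, subject, session=None, datatype="dwi"):
    """Build <root>/sub-<subject>[/ses-<session>]/<datatype>.

    Hypothetical helper mirroring the raw BIDS layout.
    """
    parts = [f"sub-{subject}"]
    if session is not None:
        parts.append(f"ses-{session}")
    parts.append(datatype)
    return Path(root).joinpath(*parts)
```

With the subject from this thread, `derivative_dir("afq", "CAM003", "01brain")` yields `afq/sub-CAM003/ses-01brain/dwi`, which matches the directory depth of the raw data.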

@dkp
Author

dkp commented Jan 4, 2024

As per https://bids-specification.readthedocs.io/en/stable/common-principles.html#storage-of-derived-datasets:

Derivatives can be stored/distributed in two ways:

  1. Under a derivatives/ subdirectory in the root of the source BIDS dataset directory to make a clear distinction between raw data and results of data processing.
  2. As a standalone dataset independent of the source (raw or derived) BIDS dataset.

I have switched to method 2 for all processing as it has a number of advantages. BIDS tools like fmriprep, MRIQC and qsiprep all support method 2. It'd be helpful if pyAFQ also supported method 2.
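
One way a tool could accept both storage methods is to look for dataset_description.json in more than one place. The sketch below is an assumption about how such a lookup might work, not a description of pyAFQ's behavior; `find_dataset_description` and its search order are hypothetical.

```python
from pathlib import Path


def find_dataset_description(pipeline_dir):
    """Return the first dataset_description.json found, or None.

    Hypothetical search order:
      1. the pipeline directory itself (standalone dataset, method 2)
      2. the parent of an enclosing "derivatives" directory (method 1)
    """
    pipeline_dir = Path(pipeline_dir).resolve()
    candidates = [pipeline_dir]
    for ancestor in pipeline_dir.parents:
        if ancestor.name == "derivatives":
            candidates.append(ancestor.parent)
            break
    for directory in candidates:
        description = directory / "dataset_description.json"
        if description.is_file():
            return description
    return None
```

Checking the pipeline directory first means a standalone derivative dataset is recognized on its own, while the nested layout still works through the second candidate.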

Thanks!
