Adds ParallelGroupAFQ #1124

teresamg · 2024-04-10T17:39:19Z

Creates a ParallelGroupAFQ class inheriting from GroupAFQ to allow for easier parallelization through pydra. Does not currently work; in progress.

pep8speaks · 2024-04-10T17:39:24Z

Hello @teresamg! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file AFQ/api/group.py:

Line 352:80: E501 line too long (86 > 79 characters)
Line 1012:1: W293 blank line contains whitespace

Comment last updated at 2024-05-15 23:45:53 UTC

arokem

This is a really neat implementation now! My main suggestion is to change ParticipantAFQ so that the names of kwargs and attributes are well-matched, so that you can make this even neater.

arokem · 2024-04-18T18:53:40Z

The code looks quite neat now!

@36000 : could you take a quick look when you get a chance?

Any ideas about how we can test this? I guess that we'd have to set this up as a nightly test, possibly running this parallelized with concurrent futures across two HBN subjects?

36000 · 2024-04-18T20:28:35Z

my thoughts, I will implement these:

add a similar looking export function to wrap export like the export_all implemented here
add pydra to setup.cfg
add a nightly test on 2 hbn subjects using concurrent futures for pydra

36000 · 2024-04-22T20:48:04Z

I made some minor changes here. I made pydra a required library for pyAFQ (it's pip-installable with few dependencies, so this will just be easier for most users). We now try to catch the error Ariel mentioned (let me know if this doesn't work). And I added an export function based on the export_all function. Next I will add the nightly test, and then I think we can merge this.

36000 · 2024-04-22T21:12:17Z

I wrote this test:

def test_AFQ_pydra():
    _, bids_path = afd.fetch_hbn_preproc(["NDARAA948VFH", "NDARAV554TP2"])
    pga = ParallelGroupAFQ(bids_path, preproc_pipeline="qsiprep")
    pga.export_all()

and I am getting this error:

FAILED AFQ/tests/test_api.py::test_AFQ_pydra - ValueError: Split is missing values for the following fields ['pAFQ_kwargs']

Either of you two know what causes this?

arokem · 2024-04-24T14:39:54Z

Could you please push that test to this PR? I'm not sure what's up and would like to try to debug locally.

36000 · 2024-04-24T19:05:13Z

Sure. I figured out this was due to me using a different pydra version (the latest, 0.23). But now I am getting pickling problems. I will push some of the changes I have made, but I think maybe we will have to talk about this in person at some point. I am running into a few issues.

36000 · 2024-05-07T01:19:12Z

@teresamg mind trying this to see if it works?
I factored out bidslayout

teresamg · 2024-05-15T11:49:51Z

Looks like it ran successfully, both locally and on Hyak!

arokem · 2024-05-15T13:38:00Z

Did it generate the combined tract profiles file? I also ran a test on hyak:

import os.path as op

from AFQ.api.group import ParallelGroupAFQ
from AFQ.definitions.image import RoiImage
import AFQ.api.bundle_dict as abd
import AFQ.data.fetch as afd


_, bids_path = afd.fetch_hbn_preproc(
    ["NDARZT957CWG",
     "NDARZU279XR3",
     "NDARZU401RCU",
     "NDARZU822WN3",
     "NDARZV421TCZ",
     "NDARZW262ZLV",
     "NDARZX163EWC",
     ],
    path="/gscratch/scrubbed/arokem/data/")

my_afq = ParallelGroupAFQ(
    bids_path=bids_path,
    preproc_pipeline="qsiprep",
    parallel_params={
        "submitter_params": {
            "plugin": "slurm",
            "sbatch_args": "-J test \
                            -p ckpt \
                            --nodes=1 \
                            --cpus-per-task=8 \
                            --gpus=1 \
                            -A escience \
                            --mem=64G \
                            --time=2:00:00 \
                            -o /gscratch/scrubbed/arokem/logs/test.out \
                            -e /gscratch/scrubbed/arokem/logs/test.err \
                            --mail-user=arokem@uw.edu \
                            --mail-type=ALL"
        },
        "cache_dir": "/gscratch/scrubbed/arokem/tmp"
    }
)

my_afq.export_all()

which worked great at individual subject level, but I can't find the final combined tract profiles file, which I was expecting to find.
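A side note on the snippet above: backslash continuations inside a string literal keep the leading indentation of each continued line, so the resulting `sbatch_args` value contains long runs of spaces. That appears to have been harmless in this run, but it makes the string awkward to log or edit (for example when raising `--time`). A minimal sketch of assembling the same options from a dict follows; `build_sbatch_args` is a hypothetical helper written for illustration, not part of pyAFQ or pydra.

```python
def build_sbatch_args(options):
    """Join sbatch options into one single-spaced string.

    Hypothetical helper for illustration only. Long options ("--name")
    are rendered as "--name=value", short ones ("-x") as "-x value".
    """
    parts = []
    for flag, value in options.items():
        if flag.startswith("--"):
            parts.append(f"{flag}={value}")
        else:
            parts.extend([flag, str(value)])
    return " ".join(parts)


# Same options as in the test script above, one per line.
sbatch_args = build_sbatch_args({
    "-J": "test",
    "-p": "ckpt",
    "--nodes": 1,
    "--cpus-per-task": 8,
    "--gpus": 1,
    "-A": "escience",
    "--mem": "64G",
    "--time": "2:00:00",
    "-o": "/gscratch/scrubbed/arokem/logs/test.out",
    "-e": "/gscratch/scrubbed/arokem/logs/test.err",
    "--mail-user": "arokem@uw.edu",
    "--mail-type": "ALL",
})
```

The resulting string can then be passed as the `"sbatch_args"` entry of `"submitter_params"`, exactly where the literal string sits in the script above.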

teresamg · 2024-05-15T15:56:03Z

It did for me... does each subject have the *_desc-profiles_dwi.csv?

36000 · 2024-05-15T16:38:23Z

AFQ/api/group.py

+                **pAFQ_kwargs.kwargs)
+            pAFQ.export_all(viz, xforms, indiv)
+
+            for dir in finishing_params["output_dirs"]:


I am a little confused about what these lines here are trying to do

The idea behind this is that we want to generate a group-level csv file that contains the merged tract profiles. But we only want to do this once, so each task checks whether all of the other tasks have completed. If they have not, it returns and nothing happens. But if it is the last task running, the condition is always fulfilled and the code continues to lines 1124-1126, which run one last GroupAFQ, which would generate the combined tract profile. I worry that this is not super robust, though, and could potentially run into some funky race conditions. In my own test on Hyak, I did not get the combined tract profile. So, maybe we need to do something a bit more direct to create the group-level derivatives.

1120 is checking whether the process is the last process running by trawling for each subject's *_desc-profiles_dwi.csv. If all are present, it exports the GroupAFQ object to create the final tract_profiles.csv.
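The "am I the last task?" check described in the two comments above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in AFQ/api/group.py: the function name `all_profiles_present` and its signature are hypothetical, and only the `*_desc-profiles_dwi.csv` pattern and the `output_dirs` idea come from the discussion.

```python
from pathlib import Path


def all_profiles_present(output_dirs, pattern="*_desc-profiles_dwi.csv"):
    """Return True if every output directory holds at least one profiles CSV.

    Hypothetical helper: each finishing task could call this and only run
    the final group-level merge when it returns True.
    """
    return all(any(Path(out_dir).glob(pattern)) for out_dir in output_dirs)
```

Note that a check like this does not by itself rule out the race condition raised above: two tasks finishing at nearly the same time could both see every file and both attempt the merge.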

OK that makes sense

I wonder if pydra has some way to do this through their API

Ariel, if all csvs are present, perhaps you can print "output_dirs" and double check that none of the paths are wonky or null? Is there anything unusual in test.out or test.err?

Oh yeah, good call:

slurmstepd: error: *** JOB 18301898 ON g3030 CANCELLED AT 2024-05-14T22:33:02 DUE TO TIME LIMIT ***

I will try this again with a longer max time limit than two hours.

I think it would be tricky to do this through pydra, because we'd have to somehow leave the submitting process running for the duration of all the sub-processes, and that may become very cumbersome for very large datasets. Assuming we don't leave the program running until all tasks are completed, I think we have two options:

Something like what is being done here.

Don't merge, and let users do that in a separate program.

I suggest that if my current test works and we get what we expect, we go ahead and merge this PR as is, including this bit of code. I can follow up with a documentation example, based on my current experiments with the HBN data. Users need to know that the final merge into a tract profiles file will fail if any of the sub-tasks fail, so maybe we just need to clarify this in the documentation.

Or is there anything else we need to address before merging?

I agree. This looks good to me!

I'm not sure what you mean about the submitting process... it ends as soon as the jobs are submitted, not completed, and it's the last job that does the final merge (all of them check for it). Plus GroupAFQ also fails to produce tract_profiles.csv if anything fails.
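For reference, the second option raised in this thread (leaving the merge to a separate program) could look something like the sketch below. It is not what this PR does, which is to run a final GroupAFQ from the last task. The function name `merge_profiles`, the added `source_file` column, and the plain concatenation of rows are all assumptions made for illustration; the real combined file is produced by GroupAFQ and may be laid out differently.

```python
import csv
from pathlib import Path


def merge_profiles(output_dirs, merged_path,
                   pattern="*_desc-profiles_dwi.csv"):
    """Concatenate per-subject profile CSVs into one file.

    Hypothetical stand-alone merge. Raises FileNotFoundError if any
    directory has no matching CSV, mirroring the point that the merge
    cannot succeed when a sub-task has failed. Returns the row count.
    """
    rows = []
    fieldnames = ["source_file"]  # assumed extra column, not pyAFQ's
    for out_dir in output_dirs:
        matches = sorted(Path(out_dir).glob(pattern))
        if not matches:
            raise FileNotFoundError(
                f"No file matching {pattern!r} in {out_dir}")
        for csv_path in matches:
            with open(csv_path, newline="") as f:
                reader = csv.DictReader(f)
                # Collect the union of column names, in first-seen order.
                for name in reader.fieldnames or []:
                    if name not in fieldnames:
                        fieldnames.append(name)
                for row in reader:
                    row["source_file"] = csv_path.name
                    rows.append(row)
    with open(merged_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Such a script would be run by the user once all cluster jobs have finished, which avoids having every task check for completion.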

arokem · 2024-05-16T02:59:11Z

I don't know if this is directly related to the content of this PR, but I am now getting an error in visualizing the standard set of bundles, where the visualization code is raising:

INFO:AFQ:Generating colorful lines from tractography...
KeyError: 'Forceps Minor'

Presumably because it's looking for a tract that is now no longer part of the default set of tracts (because we're using the more granular set of CC tracts).

@36000 : any chance this is related to changes you introduced here to how data is passed between GroupAFQ and ParticipantAFQ?

If you think this is unrelated, I think we can probably merge this PR, and fix this issue elsewhere.

36000 · 2024-05-16T15:45:07Z

Not sure what would cause this, but I think it is unrelated, as removing overlapping bundle definitions happens in the init method of the BundleDict, so I don't think a race condition is causing this error.

Adds ParallelGroupAFQ

5f40f70

teresamg added 4 commits April 11, 2024 19:43

Recursion fix and other updates

59ad602

Changes to in-place method

9761722

Adds comments

f6794f5

ParticipantAFQ method

d24edd6

arokem requested changes Apr 17, 2024

teresamg added 2 commits April 18, 2024 11:28

Updates ParticipantAFQ kwargs

5fc2a13

Updates pydra import

1e67bd7

arokem reviewed Apr 18, 2024

Updates attribute name

4ba7307

arokem marked this pull request as ready for review April 18, 2024 21:08

arokem changed the title ~~WIP: Adds ParallelGroupAFQ~~ Adds ParallelGroupAFQ Apr 18, 2024

arokem reviewed Apr 18, 2024

add export function, make pydra required, catch error

1a23004

putting together a pydra test

16e169b

teresamg and others added 6 commits May 2, 2024 11:53

Adds final GroupAFQ step

42e4b49

Merge branch 'pydra' of github.com:teresamg/pyAFQ into pydra

4cfd044

Corrects accidental overwrites

a1ab9ec

Corrects accidental overwrites

b1f2cd5

factor out bids from tasks, have groupAFQ handle bidslayout upfront

0157e6a

update this test

f58bb95

36000 mentioned this pull request May 6, 2024

[REF] Refactor segmentation code into many files and pimms system #1132

Open

teresamg added 2 commits May 13, 2024 16:25

Updates syntax for setting defaults

e7a899a

Adds comment about AttributeError

5109ffc

36000 reviewed May 15, 2024

arokem reviewed May 15, 2024

Update AFQ/api/group.py

4a3465b

36000 merged commit 6d0ffb6 into yeatmanlab:master May 16, 2024
9 checks passed

teresamg deleted the pydra branch May 16, 2024 18:44
