ENH: Add "run up to this step" option #858

Open
larsoner opened this issue Feb 23, 2024 · 7 comments

Comments

@larsoner
Member

> So what would have been a viable solution for @SophieHerbst now? Simply re-run the entire `preprocessing` pipeline and rely on caching to skip e.g. filtering etc?

We don't have a "please run everything up to this step" functionality right now, do we? Do you think we could implement this somehow?

Originally posted by @hoechenberger in #857 (comment)

Might be a good idea to suggest that in general people prefer this to the --steps option, which is config-change-unsafe.

@hoechenberger
Member

hoechenberger commented Feb 23, 2024

Since we do have caching now, I would even go so far as to say that we could replace the existing --steps behavior with the new one – if we can guarantee that caching actually works well, including on an NFS setup

That way, we'd also avoid having to construct a dependency tree for each step (in order to ensure that all required input has been generated)

We could actually deprecate --steps in favor of ... --run-until? or something? Because supplying multiple step names wouldn't make sense anymore
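
A minimal sketch of the selection logic such an option would need, assuming the pipeline keeps its steps in a single ordered list. The `--run-until` flag and the `select_steps_until` helper are hypothetical and not part of the existing CLI; the step names are copied from the pipeline's log headers.

```python
from typing import Sequence

# Hypothetical ordered step list; names mirror the pipeline's log headers.
ALL_STEPS = [
    "init/_01_init_derivatives_dir",
    "init/_02_find_empty_room",
    "preprocessing/_01_data_quality",
    "preprocessing/_02_head_pos",
    "preprocessing/_03_maxfilter",
]


def select_steps_until(all_steps: Sequence[str], run_until: str) -> list[str]:
    """Return every step up to and including ``run_until``."""
    if run_until not in all_steps:
        raise ValueError(f"Unknown step: {run_until!r}")
    stop = all_steps.index(run_until) + 1
    return list(all_steps[:stop])


selected = select_steps_until(ALL_STEPS, "preprocessing/_02_head_pos")
print(selected)
```

Because the selection is always a prefix of the ordered list, no per-step dependency tree is needed: caching decides which of the selected steps actually recompute.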

@SophieHerbst
Collaborator

I find --run-until clearer than --steps with respect to what is actually done. The doc should then make clear that only steps for which a config change occurred will be rerun.
Is it too complex to mark for each config option which step it refers to?

@hoechenberger
Member

> Is it too complex to mark for each config option which step it refers to?

No, I think this could be generated automatically these days
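
One way such a mapping could be generated, as a rough sketch: parse each step's source and collect the attributes it reads from a `config` object. This is an assumption about how step code accesses its configuration, not a description of the pipeline's internals; `config_options_used`, `option_a`, `option_b`, and the step sources below are all made up for illustration.

```python
import ast


def config_options_used(step_source: str, config_name: str = "config") -> set[str]:
    """Collect attribute names read from ``config_name`` in a step's source."""
    used = set()
    for node in ast.walk(ast.parse(step_source)):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == config_name
        ):
            used.add(node.attr)
    return used


# Hypothetical step sources keyed by step name (option names are invented).
step_sources = {
    "preprocessing/_01_data_quality": "def run(config):\n    return config.option_a\n",
    "preprocessing/_03_maxfilter": "def run(config):\n    return config.option_a + config.option_b\n",
}

# Invert the result: for each option, list the steps that read it.
option_to_steps: dict[str, list[str]] = {}
for step, source in step_sources.items():
    for option in sorted(config_options_used(source)):
        option_to_steps.setdefault(option, []).append(step)

print(option_to_steps)
```

A static scan like this misses options reached indirectly (for example through helper functions), so a real implementation would likely need more than attribute matching.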

@SophieHerbst
Collaborator

Is it normal that the preprocessing pipeline is run again after an update of the pipeline?
pip install -U --no-deps git+https://github.com/mne-tools/mne-bids-pipeline@main
I changed nothing in my config.
This is a bit problematic, because every time I manage to free some time to work on this, a good part of it is taken up by just waiting for previous steps to recompute.

┌────────┬ init/_01_init_derivatives_dir ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│10:56:07│ ✅ Output directories already exist …
└────────┴ done (20s)
┌────────┬ init/_02_find_empty_room ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│10:56:27│ ✅ sub-155 run-01 Computation unnecessary (cached) …
└────────┴ done (5s)
┌────────┬ preprocessing/_01_data_quality ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│10:56:32│ ✅ sub-155 run-01 Computation unnecessary (cached) …
│10:56:32│ ✅ sub-155 run-02 Computation unnecessary (cached) …
│10:56:32│ ✅ sub-155 run-03 Computation unnecessary (cached) …
│10:56:32│ ✅ sub-155 run-04 Computation unnecessary (cached) …
│10:56:32│ ✅ sub-155 run-05 Computation unnecessary (cached) …
│10:56:32│ ✅ sub-155 run-06 Computation unnecessary (cached) …
│10:56:32│ ✅ sub-155 run-07 Computation unnecessary (cached) …
│10:56:33│ ✅ sub-155 run-08 Computation unnecessary (cached) …
│10:56:33│ ✅ sub-155 run-rest Computation unnecessary (cached) …
│10:56:33│ ✅ sub-155 run-noise Computation unnecessary (cached) …
└────────┴ done (2s)
┌────────┬ preprocessing/_02_head_pos ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│10:56:33│ ⏩ Skipping …
└────────┴ done (1s)
┌────────┬ preprocessing/_03_maxfilter ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│10:56:46│ ⏳️ sub-155 run-01 Loading reference run: 01.
│10:57:55│ ⏳️ sub-155 run-01 Applying SSS to experimental data
│10:57:59│ ⏳️ sub-155 run-01 Destination is 0.0 mm and 0.0° from the original head position
│10:58:06│ ⏳️ sub-155 run-01 Writing sub-155/meg/sub-155_task-tiwm_run-01_proc-sss_raw.fif
│10:59:17│ ⏳️ sub-155 run-01 Adding Maxwell filtered raw data to report.
│10:59:30│ ⏳️ sub-155 run-01 Adding config and sys info to report
│11:00:09│ ⏳️ sub-155 run-01 Saving report: /neurospin/meg/meg_tmp/TimeInWM_Izem_2019/BIDS_anonymized/derivatives/sub-155/meg/sub-155_task-tiwm_report.html
│11:00:14│ ✅ sub-155 run-01 Computation unnecessary (cached) …
│11:00:21│ ⏳️ sub-155 run-02 Loading reference run: 01.
│11:00:21│ ⏳️ sub-155 run-02 Applying SSS to experimental data
│11:01:28│ ⏳️ sub-155 run-02 Destination is 2.3 mm and 2.5° from the original head position
│11:01:33│ ⏳️ sub-155 run-02 Writing sub-155/meg/sub-155_task-tiwm_run-02_proc-sss_raw.fif
│11:02:41│ ⏳️ sub-155 run-02 Adding Maxwell filtered raw data to report.
│11:02:53│ ⏳️ sub-155 run-02 Saving report: /neurospin/meg/meg_tmp/TimeInWM_Izem_2019/BIDS_anonymized/derivatives/sub-155/meg/sub-155_task-tiwm_report.html
│11:03:04│ ⏳️ sub-155 run-03 Loading reference run: 01.
│11:03:04│ ⏳️ sub-155 run-03 Applying SSS to experimental data
│11:04:13│ ⏳️ sub-155 run-03 Destination is 5.3 mm and 2.1° from the original head position
│11:04:18│ ⏳️ sub-155 run-03 Writing sub-155/meg/sub-155_task-tiwm_run-03_proc-sss_raw.fif
│11:05:27│ ⏳️ sub-155 run-03 Adding Maxwell filtered raw data to report.
│11:05:38│ ⏳️ sub-155 run-03 Saving report: /neurospin/meg/meg_tmp/TimeInWM_Izem_2019/BIDS_anonymized/derivatives/sub-155/meg/sub-155_task-tiwm_report.html
│11:05:49│ ⏳️ sub-155 run-04 Loading reference run: 01.
│11:05:50│ ⏳️ sub-155 run-04 Applying SSS to experimental data

@larsoner
Member Author

Yes, it's normal -- if the code of a step changes, the caching function detects it and says the step should be rerun. This should be why the data quality step was skipped (we didn't change that code lately) but the MF step ran (we've fixed bugs there in the last month or so). It would be dangerous / a bug if it didn't behave this way. So when you update, any steps whose code has changed compared to your old version will need to rerun.
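
To illustrate the idea (not the pipeline's actual caching code), here is a minimal sketch in which a step's fingerprint covers both its source text and its config values, so a change to either one invalidates the cache. The names `fingerprint` and `needs_rerun`, the in-memory `cache` dict, and the example source strings and config value are all assumptions made for this example.

```python
import hashlib
import json


def fingerprint(step_source: str, config: dict) -> str:
    """Hash a step's source text together with its config values."""
    payload = step_source + "\n" + json.dumps(config, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_rerun(step: str, step_source: str, config: dict, cache: dict) -> bool:
    """Return True (and update the cache) if the fingerprint changed."""
    new = fingerprint(step_source, config)
    if cache.get(step) == new:
        return False
    cache[step] = new
    return True


cache: dict = {}
config = {"some_option": 1.0}  # illustrative value only
old_source = "def run(raw): return raw"
new_source = "def run(raw): return raw.copy()"

first = needs_rerun("maxfilter", old_source, config, cache)   # nothing cached yet
second = needs_rerun("maxfilter", old_source, config, cache)  # unchanged, cached
third = needs_rerun("maxfilter", new_source, config, cache)   # code changed
print(first, second, third)
```

In this sketch an updated step reruns even with an untouched config, which matches the behavior described above: the unchanged data quality step stays cached while the modified Maxwell filter step recomputes.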

@SophieHerbst
Collaborator

@larsoner I can't find the quote, but you asked how long a rerun takes if everything is cached:
1m 55s for one subject

@JD-Zhu

JD-Zhu commented Apr 4, 2024

I think I have a related request - would it be possible to only run steps from a certain point onwards? One example scenario where this would be useful is if you obtain the intermediate processing files from a collaborator and would like to run the remaining steps.

Please let me know if this should be a separate issue instead.
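
A rough sketch of what "run from this step onwards" could involve, assuming an ordered step list and a caller-supplied list of intermediate files the first selected step expects. `select_steps_from`, `missing_inputs`, and the file names are hypothetical; nothing here reflects an existing pipeline option.

```python
import tempfile
from pathlib import Path
from typing import Sequence

# Hypothetical ordered step list; names mirror the pipeline's log headers.
ALL_STEPS = [
    "init/_01_init_derivatives_dir",
    "init/_02_find_empty_room",
    "preprocessing/_01_data_quality",
    "preprocessing/_02_head_pos",
    "preprocessing/_03_maxfilter",
]


def select_steps_from(all_steps: Sequence[str], run_from: str) -> list[str]:
    """Return ``run_from`` and every step after it."""
    if run_from not in all_steps:
        raise ValueError(f"Unknown step: {run_from!r}")
    return list(all_steps[all_steps.index(run_from):])


def missing_inputs(required: Sequence[Path]) -> list[Path]:
    """Return the required intermediate files that do not exist on disk."""
    return [path for path in required if not path.exists()]


with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "present_raw.fif"  # made-up file names
    present.touch()
    absent = Path(tmp) / "absent_raw.fif"
    missing = missing_inputs([present, absent])

remaining = select_steps_from(ALL_STEPS, "preprocessing/_03_maxfilter")
print(remaining)
print([path.name for path in missing])
```

Unlike running up to a step, starting midway cannot lean on caching alone: files obtained from a collaborator have no local cache entry, so an explicit input check like this one (or some equivalent) would be needed before skipping the earlier steps.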
