Implement MD workflows #1672

tomdemeyere · 2024-02-07T22:44:58Z

Summary of Changes

Closes #1577.

Before going further, a little consultation:

MD users use workflows all the time. Even for the NVE example in this commit, there are pretty much a thousand ways to run it: a first NVT run to relax to the desired temperature, a first NPT run to relax to the desired density, etc.

For example, should we have one NVT workflow with a parameter "thermostat", or one workflow per thermostat?
If you want things to be in recipe.common, atoms will have to be sent in with a calculator already attached, is that what you want?
From what I understand you want to keep things simple (not something like the code here), users then build their workflows to get what they want.
In any case, for now doing these kinds of workflows is difficult because it is generally hard to transfer results between ASE calculators, which will always happen between runs. This is critical, since MD needs momenta and forces to ensure proper continuation. (This is a problem I would like fixed upstream, since it also causes problems when restarting optimizations, etc.)
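
To illustrate the continuation point, here is a minimal sketch in plain Python (no ASE; the harmonic `force` function and all names are assumptions made for illustration). It shows that a velocity Verlet run split in two only reproduces the uninterrupted run if the velocities are carried over at the restart.

```python
def force(x: float, k: float = 1.0) -> float:
    """Force of a 1D harmonic oscillator (illustrative stand-in for a calculator)."""
    return -k * x


def velocity_verlet(
    x: float, v: float, n_steps: int, dt: float = 0.01, mass: float = 1.0
) -> tuple[float, float]:
    """Integrate n_steps of velocity Verlet and return the final (x, v)."""
    f = force(x)  # forces are recomputed from the positions at (re)start
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass
        x += dt * v
        f = force(x)
        v += 0.5 * dt * f / mass
    return x, v


# One uninterrupted run of 100 steps
straight = velocity_verlet(1.0, 0.0, 100)

# Two runs of 50 steps, passing positions AND velocities across the restart
x_mid, v_mid = velocity_verlet(1.0, 0.0, 50)
restarted = velocity_verlet(x_mid, v_mid, 50)

# Same split, but the velocities are lost at the restart
lost_momenta = velocity_verlet(x_mid, 0.0, 50)
```

Here `restarted` matches `straight`, while `lost_momenta` ends up on a different trajectory.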

Checklist

I have read the "Guidelines" section of the contributing guide. Don't lie! 😉
My PR is on a custom branch and is not named main.
I have used black, isort, and ruff as described in the style guide.
I have added relevant, comprehensive unit tests.

buildbot-princeton · 2024-02-07T22:45:01Z

Can one of the admins verify this patch?

Andrew-S-Rosen · 2024-02-07T22:51:12Z

@tomdemeyere: Happy to respond to each question but before I do, I just wanted to check that this is the full commit you wanted to share --- just the args/kwargs, right?

(In short, I really would love to have MD flows and am willing to work together to figure out how to get this done the best way possible)

codecov · 2024-02-07T22:54:27Z

Codecov Report

Attention: Patch coverage is 98.18182% with 2 lines in your changes missing coverage. Please review.

Project coverage is 98.99%. Comparing base (a73891d) to head (25cb740).

Files	Patch %	Lines
src/quacc/runners/ase.py	95.91%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1672      +/-   ##
==========================================
- Coverage   99.01%   98.99%   -0.03%     
==========================================
  Files          81       83       +2     
  Lines        3367     3477     +110     
==========================================
+ Hits         3334     3442     +108     
- Misses         33       35       +2

tomdemeyere · 2024-02-07T23:27:01Z

> @tomdemeyere: Happy to respond to each question but before I do, I just wanted to check that this is the full commit you wanted to share --- just the args/kwargs, right?
>
> (In short, I really would love to have MD flows and am willing to work together to figure out how to get this done the best way possible)

Yes, before doing more I wanted to make sure we are on the same page.

I guess my main question here is: how to make these flows common to all codes? Should they call code-specific MD jobs? Or should a calculator be attached to the Atoms object being sent to the MD flow?

Once I know this, we can work on the details.

Andrew-S-Rosen · 2024-02-08T00:45:07Z

> I guess my main question here is: how to make these flows common to all codes? Should they call code specific MD jobs? Or, should a calculator be attached to the Atoms object being sent to the MD flow?

@tomdemeyere: Good question.

Personally, I think what would likely make the most sense is to make a run_md function in quacc.runners.ase that would take an Atoms object with an attached calculator just like the other methods in there. You likely were aware of that already.

From there, we get to the recipes, and my answer is: "it depends." If it is just a simple @job (i.e. not a @flow), then this is no different than a relaxation job --- one would just be calling run_md instead of run_calc or run_ase_opt in a given code's module (e.g. quacc.recipes.emt). Since there wouldn't be much logic beyond defining default arguments and calling run_md, this is fine and doesn't introduce much duplication. I would treat the individual MD jobs in the same way as a relaxation --- we have a runner, which is called in the various recipes where it is appropriate. It does mean that there would be multiple MD modules, but we would need that anyway because every code will have different parameters (defaults) that are needed to properly run an MD calculation.

If we are talking about a @flow where there are multiple Jobs being stitched together, and we anticipate this pattern (i.e. the workflow DAG) to be calculator-agnostic, then we'll put that in common. This is the philosophy behind the existing common recipes. But we would still need individual Jobs for each code. In general, users aren't expected to call the common workflow directly. Aside from usability aspects, there are some other nuances about (de)serialization needed by workflow engines for why I say this, but I won't bore you with those details here.
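
As a rough sketch of this split (plain Python without the `@job`/`@flow` decorators; `emt_md_job` and `equilibrate_then_run_flow` are hypothetical names, not existing quacc functions), a calculator-agnostic flow can take the code-specific jobs as arguments and only define how they are stitched together:

```python
from functools import partial
from typing import Any, Callable

# Hypothetical alias: a "job" maps a structure to a results dictionary.
Job = Callable[[dict[str, Any]], dict[str, Any]]


def emt_md_job(
    atoms: dict[str, Any], ensemble: str = "nve", steps: int = 100
) -> dict[str, Any]:
    """Stand-in for a code-specific job: it would set EMT defaults and call a shared MD runner."""
    return {"atoms": atoms, "calculator": "emt", "ensemble": ensemble, "steps": steps}


def equilibrate_then_run_flow(
    atoms: dict[str, Any], equilibration_job: Job, production_job: Job
) -> dict[str, Any]:
    """Calculator-agnostic pattern: only the order of the jobs is defined here."""
    equilibrated = equilibration_job(atoms)
    return production_job(equilibrated["atoms"])


result = equilibrate_then_run_flow(
    {"symbols": "Cu4"},  # placeholder for an Atoms object
    equilibration_job=partial(emt_md_job, ensemble="nvt", steps=50),
    production_job=partial(emt_md_job, ensemble="nve", steps=200),
)
```

The flow never touches a calculator; swapping codes means passing in different jobs.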

To get started, I would suggest making quacc.runners.ase.run_md and then a simple demo for a cheap-to-run calculator, like EMT or LJ. That will likely be illustrative and help guide the design further.

I should also note that we can always refactor later if there is something we can refactor. No need to prematurely optimize.

> For example, should we have one NVT workflow with a parameter "thermostat", or one workflow per thermostat?

If the workflow logic is staying the same regardless of thermostat, then I would suggest one job/workflow where thermostat would be a keyword argument with a sensible default.
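
A minimal sketch of what that could look like, assuming a hypothetical `nvt_job` and a lookup table of per-thermostat defaults (the names and numeric defaults below are placeholders for illustration, not quacc or ASE values):

```python
from typing import Any, Optional

# Placeholder defaults per thermostat; a real recipe would map these names
# to actual dynamics classes and sensible parameters.
THERMOSTAT_DEFAULTS: dict[str, dict[str, Any]] = {
    "langevin": {"friction": 0.01},
    "berendsen": {"taut": 100.0},
}


def nvt_job(
    atoms: Any,
    temperature_K: float = 300.0,
    thermostat: str = "langevin",
    thermostat_kwargs: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """One NVT entry point; the thermostat is a keyword argument with a default."""
    if thermostat not in THERMOSTAT_DEFAULTS:
        raise ValueError(f"Unknown thermostat: {thermostat!r}")
    # User-supplied parameters override the defaults for the chosen thermostat
    params = {**THERMOSTAT_DEFAULTS[thermostat], **(thermostat_kwargs or {})}
    return {
        "atoms": atoms,
        "temperature_K": temperature_K,
        "thermostat": thermostat,
        "thermostat_params": params,
    }


default_run = nvt_job("Cu4")
custom_run = nvt_job("Cu4", thermostat="berendsen", thermostat_kwargs={"taut": 50.0})
```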

> If you want things to be in recipe.common, atoms will have to be sent in with a calculator already attached, is that what you want?

See comment above. If we are talking about a @job, then that would call run_md and be specific for the code since we need to set some sensible default parameters. If it's a @flow pattern, that can go in common but would not require passing in Atoms objects with attached calculators --- it would require passing in one or more Jobs.

> From what I understand you want to keep things simple (not something like the code here), users then build their workflows to get what they want.

Hard for me to tell from the commit you shared here, but I am not opposed to complexity. In fact, we should not rely on the users to build something complex. If the @flow pattern is "trivial", like a relax then an MD run, it is debatable if we should make that its own @flow or not (honestly, it can go either way). But if it's more complex, for sure we should.

> In any case, for now doing these kinds of workflows is difficult, because it is generally hard to transfer results between ASE calculators, which will always happen between runs, this is critical since MD needs momenta and forces to ensure proper continuation. (This is a problem I would like fixed above since it also causes problems when restarting optimization etc)

This is a good point. However, I think it is solvable. Let's circle back to this when the time is right.

tomdemeyere · 2024-02-08T15:54:41Z

@Andrew-S-Rosen Thanks for all these details. I will come back to it at some point

tomdemeyere · 2024-02-14T08:15:23Z

@Andrew-S-Rosen I think this is now a good point to have your input

Andrew-S-Rosen

Thank you for kicking this off, @tomdemeyere! Really appreciate the contribution!

This is looking good. I have some comments that are mostly related to some minor restructuring.

Starting with this isolated EMT example is very helpful for me in understanding the process and figuring out what should be refactored, so I thank you for doing that.

Andrew-S-Rosen · 2024-02-14T19:46:30Z

src/quacc/recipes/emt/core.py

+
+    if temperature:
+        MaxwellBoltzmannDistribution(
+            atoms, temperature_K=temperature, **initial_temperature_params


With the fixcm and fixrot as dedicated keyword arguments, the **kwargs here can be more clearly described as simply being the **maxwell_boltzmann_distribution_kwargs. That is a mouthful (although nobody is calling the name directly). Could try something like **maxwell_boltzmann_kwargs. Either way, the point is that both the docstring and name would be much clearer this way. At a glance, it's not clear what an **initial_temperature_params is.
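
To make the naming suggestion concrete, here is a minimal stdlib-only sketch (not the PR's code). `sample_velocities` is a hypothetical stand-in for the Maxwell-Boltzmann sampler, `init_velocities` and `fix_com` are assumed names, and rotation removal is left out for brevity. The explicitly named `maxwell_boltzmann_kwargs` is forwarded only to the sampler.

```python
import math
import random
from typing import Any, Optional

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def sample_velocities(
    masses: list[float], temperature_K: float, rng: Optional[random.Random] = None
) -> list[list[float]]:
    """Draw Maxwell-Boltzmann velocities (units of sqrt(eV/amu)), one 3-vector per atom."""
    rng = rng or random.Random()
    return [
        [rng.gauss(0.0, math.sqrt(K_B * temperature_K / m)) for _ in range(3)]
        for m in masses
    ]


def init_velocities(
    masses: list[float],
    temperature_K: float,
    fix_com: bool = True,
    maxwell_boltzmann_kwargs: Optional[dict[str, Any]] = None,
) -> list[list[float]]:
    """Dedicated keyword for the COM fix; everything else goes to the sampler."""
    velocities = sample_velocities(
        masses, temperature_K, **(maxwell_boltzmann_kwargs or {})
    )
    if fix_com:
        total_mass = sum(masses)
        for axis in range(3):
            # Subtract the centre-of-mass velocity so the total momentum is zero
            v_com = sum(m * v[axis] for m, v in zip(masses, velocities)) / total_mass
            for v in velocities:
                v[axis] -= v_com
    return velocities


masses = [63.546] * 4  # four Cu atoms, masses in amu
velocities = init_velocities(
    masses, 300.0, maxwell_boltzmann_kwargs={"rng": random.Random(0)}
)
```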

tomdemeyere · 2024-02-15T08:39:30Z

> Thank you for kicking this off, @tomdemeyere! Really appreciate the contribution!
>
> This is looking good. I have some comments that are mostly related to some minor restructuring.
>
> Starting with this isolated EMT example is very helpful for me in understanding the process and figuring out what should be refactored, so I thank you for doing that.

I see that your comment is different than the one I received by email from git. But yes, originally my plan was to have these "_base" functions. Either per calculator or in common like you previously said.

After discussing that along with other details I was then planning to introduce recipes for the other ensembles (NVT, NPT).

I didn't see your comments originally (GitHub mobile); I will work on it at some point.

Andrew-S-Rosen · 2024-02-15T11:41:30Z

> I see that your comment is different than the one I received by email from git. But yes, originally my plan was to have these "_base" functions. Either per calculator or in common like you previously said.

Apologies about the edits! I was trying to sort out where to place them and was going back-and-forth so decided to just save it for a future discussion to avoid confusion. But you got the right idea anyway.

> After discussing that along with other details I was then planning to introduce recipes for the other ensembles (NVT, NPT).

That would be great!

> Didn't see your comments originally (github mobile) will work on it at some point

Thank you!

tomdemeyere · 2024-02-15T22:25:29Z

A few comments:

Without run_kwargs there is no way to pass things to dyn.run()
There is still a check_convergence, although it does not make much sense for MD. By definition you reached the max number of steps, but there is a case where it makes sense:

We are currently working on an error-free quacc as part of another project. We make sure that quacc always reaches the point of writing to the database: it writes the errors that occurred, along with any available results. Errors might happen partway through the MD run, in which case the previous results should still be returned; in some cases, errors might even be part of your journey (although this might not be what quacc was originally made for). Currently, if one is running a bunch of simulations and errors occur, there will be a bunch of directories with no easy way to figure things out.

Because you said you might be interested in error handling I am leaving that there.

I changed how temperature, time, etc. are written to results; feel free to tell me what you think.

Andrew-S-Rosen · 2024-02-15T22:52:38Z

> Without run_kwargs there is no way to pass things to dyn.run()

The fmax and steps are already passed in quacc.runners.ase.run_opt, and .run() doesn't take any other keyword arguments, so we don't need it.

> There is still a check_convergence, although it does not make much sense for MD. By definition you reached the max number of steps, but there is a case where it makes sense:

I agree that a check_convergence doesn't make much sense for this job type. We should remove that logic entirely.

> We are currently working on an error-free quacc as part of another project. We make sure that quacc always reaches the point of writing to the database: it writes the errors that occurred, along with the available results,

One could catch the typical RuntimeError errors when ASE is called and try to give the user some flexibility via the global settings about how to handle things with the CHECK_CONVERGENCE and/or some other global parameter. But of course, feel free to continue doing what you're doing!
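
A minimal sketch of that idea in plain Python, where `run_md_capture_errors` and `flaky_step` are hypothetical names and the step function stands in for a real dynamics step: catch the `RuntimeError`, keep whatever results were produced so far, and record the error alongside them instead of letting the job die.

```python
from typing import Any, Callable


def run_md_capture_errors(
    step_fn: Callable[[int], dict[str, Any]], n_steps: int
) -> dict[str, Any]:
    """Run up to n_steps; on RuntimeError, return partial results plus the error."""
    results: dict[str, Any] = {"trajectory": [], "completed_steps": 0, "error": None}
    try:
        for step in range(n_steps):
            results["trajectory"].append(step_fn(step))
            results["completed_steps"] = step + 1
    except RuntimeError as exc:
        # Store the failure so it can be written to the database with the results
        results["error"] = f"{type(exc).__name__}: {exc}"
    return results


def flaky_step(step: int) -> dict[str, Any]:
    """Illustrative step that fails partway through the run."""
    if step == 3:
        raise RuntimeError("SCF did not converge")
    return {"step": step, "energy": -1.0 * step}


summary = run_md_capture_errors(flaky_step, n_steps=10)
```

Whether the failure is then re-raised or only stored could be governed by a global setting, as suggested above.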

> Currently if one is running a bunch of simulations, if errors occur, there will be a bunch of directories with no easy way to figure things out.

Indeed. This is a limitation of Parsl/Dask, which is really best thought of as a task manager. Other true workflow tools like Prefect or Covalent or FireWorks have a separate database for the task metadata and would highlight when/where the job fails. But since Parsl does not have a task database of its own, there is no mechanism to easily store that info. The recommended approach would be to launch your calculations in a for loop and log any errors from the Parsl AppFuture. But of course, your mechanism is fine as well.
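
The loop-and-log pattern can be sketched with the standard library's `concurrent.futures` as a stand-in. This assumes the Parsl AppFuture exposes the same `Future` interface (`exception()`, `result()`); `run_simulation` is a hypothetical placeholder task.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Any

logger = logging.getLogger("md_batch")


def run_simulation(temperature_K: float) -> dict[str, Any]:
    """Placeholder task; fails for one input to show the error path."""
    if temperature_K > 1000:
        raise RuntimeError(f"simulation at {temperature_K} K blew up")
    return {"temperature_K": temperature_K, "status": "done"}


temperatures = [300, 600, 1500]
results: dict[float, dict[str, Any]] = {}
failures: dict[float, str] = {}

with ThreadPoolExecutor(max_workers=2) as pool:
    # Launch everything first, then inspect each future in a loop
    futures = {t: pool.submit(run_simulation, t) for t in temperatures}
    for t, future in futures.items():
        error = future.exception()  # blocks until this task has finished
        if error is not None:
            logger.error("Run at %s K failed: %s", t, error)
            failures[t] = str(error)
        else:
            results[t] = future.result()
```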

> I changed how temperature/time etc... is written to results, feel free to tell me what you think

I like that a lot more!

tomdemeyere · 2024-04-10T00:09:40Z

This should be close to ready, but there is still something essential missing before MD can be run correctly: restarts. They are still difficult to do (impossible in some cases), but that's an ASE problem. I suspect this will take ages to fix; for now I deleted everything related to restarts while we wait to see what solutions will get accepted upstream (if any).

I looked at atomate2; there are indeed some interesting things, like the possibility to specify a temperature/pressure gradient. I will come back with my idea of "patches" to accomplish that. This would look like:

steps = np.arange(1000, 10000, 1)
temperatures = function_of_steps(steps)

my_temp_gradient = TemperatureGradient(steps, temperatures)
my_file_copier = FileCopier(interval=10)
my_press_gradient = PressureGradient(...)

md_job(..., patches=[my_temp_gradient, my_file_copier, my_press_gradient])

This is achievable by tuning the generator as you did previously (although it might get packed), for some reason I love this "patching" concept 😅. People could fine-tune their MD, picking what triggers when, Quacc could even have a section in the documentation explaining how to do complex things as we already discussed. Anyway, daydreaming here 🙃
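
One way the "patches" idea could be prototyped, as a sketch only: each patch is a callable invoked once per step with a mutable state dictionary. The call signature, `StepLogger`, and `run_md_with_patches` are all assumptions made for illustration, and the integrator step itself is left out.

```python
from bisect import bisect_right
from typing import Any, Callable, Sequence

Patch = Callable[[int, dict[str, Any]], None]


class TemperatureGradient:
    """Set the target temperature from a (step, temperature) schedule."""

    def __init__(self, steps: Sequence[int], temperatures: Sequence[float]) -> None:
        self.steps = list(steps)
        self.temperatures = list(temperatures)

    def __call__(self, step: int, state: dict[str, Any]) -> None:
        idx = bisect_right(self.steps, step) - 1  # last schedule entry <= step
        if idx >= 0:
            state["target_temperature_K"] = self.temperatures[idx]


class StepLogger:
    """Record the target temperature every `interval` steps."""

    def __init__(self, interval: int) -> None:
        self.interval = interval
        self.seen: list[tuple[int, float]] = []

    def __call__(self, step: int, state: dict[str, Any]) -> None:
        if step % self.interval == 0:
            self.seen.append((step, state["target_temperature_K"]))


def run_md_with_patches(
    n_steps: int, patches: Sequence[Patch], initial_temperature_K: float = 300.0
) -> dict[str, Any]:
    """Skeleton MD loop: apply every patch in order, then (not shown) integrate."""
    state: dict[str, Any] = {"target_temperature_K": initial_temperature_K}
    for step in range(n_steps):
        for patch in patches:
            patch(step, state)
        # A real integrator step using state["target_temperature_K"] would go here.
    return state


ramp = TemperatureGradient(steps=[0, 10, 20], temperatures=[300.0, 400.0, 500.0])
log = StepLogger(interval=10)
final_state = run_md_with_patches(25, patches=[ramp, log])
```

Patches run in list order, so the logger above sees the temperature already updated by the ramp for that step.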

Andrew-S-Rosen · 2024-04-10T05:00:00Z

Thanks for your persistence with this! I will try to look at this within the next few days (please ping me if I forget). I'm definitely interested to chat more about the patches idea here; perhaps within the context of MD, it could be an interesting proof-of-concept to think about. I'll revisit this when I have the time to give this a proper review!

tomdemeyere · 2024-05-23T11:48:53Z

Any news on this? Just to know if it is waiting on https://gitlab.com/ase/ase/-/merge_requests/3310

initial

b23a93f

Andrew-S-Rosen added the New Recipe label Feb 9, 2024

tomdemeyere added 5 commits February 11, 2024 21:14

fix

3e332dc

Merge remote-tracking branch 'upstream/main' into md_workflow

5cbeaea

working

c736ac3

working

6c36678

working

42dcb82

better tests

f806b57

Andrew-S-Rosen reviewed Feb 14, 2024

Merge branch 'main' into md_workflow

97d1511

Andrew-S-Rosen reviewed Feb 15, 2024

tomdemeyere added 4 commits February 15, 2024 22:25

suggestions

1fc6914

Merge remote-tracking branch 'origin/md_workflow' into md_workflow

6a2f634

remove comment

cfd1de7

oops

07f6943

Andrew-S-Rosen added 4 commits February 15, 2024 15:22

Merge branch 'main' into md_workflow

c744af9

Fix inconsistent fix_com and fix_rot naming

dcace7e

Fix hyperlink to docs

8a2c030

Merge branch 'main' into md_workflow

f6f8b5b

Andrew-S-Rosen added 2 commits April 10, 2024 00:10

Merge branch 'main' into md_workflow

6c2cf83

Merge branch 'main' into md_workflow

5fb9bf5

Andrew-S-Rosen changed the title ~~Draft: MD workflows~~ Implement MD workflows Apr 10, 2024

Andrew-S-Rosen and others added 23 commits April 11, 2024 12:39

Some minor docstring cleanup

260cf78

Merge branch 'main' into md_workflow

a5bc2a8

pre-commit auto-fixes

d11806d

Fix type hint

dcd3b37

Fix type hints

de9ec00

Fix type hint

18561f7

pre-commit auto-fixes

b797acb

Minor test cleanup

99e91da

Refactor

eda62a2

Merge branch 'main' into md_workflow

9ecd8ce

Fix import

58f2ff8

Merge branch 'main' into md_workflow

57a589b

Merge branch 'main' into md_workflow

d29bdee

Merge branch 'main' into md_workflow

c5cdc2e

pre-commit auto-fixes

636f012

fix

2469065

Merge branch 'md_workflow' of github.com:tomdemeyere/quacc into md_workflow

f946399

update docstring

5367a45

pre-commit auto-fixes

ca2b1e5

Rename md_units --> convert_md_units

e1c4e48

Merge branch 'md_workflow' of github.com:tomdemeyere/quacc into md_workflow

475ccef

pre-commit auto-fixes

d5d9f15

Merge branch 'main' into md_workflow

5dbdfa9

Merge branch 'main' into md_workflow

25cb740
