Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix restart #108

Open
ElliottKasoar opened this issue Apr 10, 2024 · 6 comments
Open

Fix restart #108

ElliottKasoar opened this issue Apr 10, 2024 · 6 comments
Assignees
Labels
bug Something isn't working janus

Comments

@ElliottKasoar
Copy link
Member

ElliottKasoar commented Apr 10, 2024

Restart correctly identifies the step from the data file, but I'm not sure it actually use the restart file to set the structure.

We can probably update the step-reading to use the data loader too.

Restart also needs some reworking for heating from #107

@ElliottKasoar ElliottKasoar added bug Something isn't working janus labels Apr 10, 2024
@ElliottKasoar ElliottKasoar self-assigned this Apr 10, 2024
@harveydevereux
Copy link
Contributor

Hi @ElliottKasoar, sorry to piggy back on this issue, but I was just thinking about restart and heating, when designing a workflow using the heating.

It seems that restart loads up the data file as you say but not explicitly the .xyz that gets written by _write_restart?

My assumption now was if I pass in --struct NAME_OF_LAST_RESTART_FILE --stats-file NAME_OF_LAST_DATA_FILE, then janus would truely restart the sim, as I have explicitly set the structure? Which is a way round the issue.

@alinelena and I were just discussing also the possibility of having a "final" restart file that has a known filename to aid automation of workflows.

Currently If I change the final temperature any follow on workflow will have to be changed to account for the new file name that depends on the step (I think). And it may not be the final step either if the user does not set the interval correctly. e.g.restart_file = f"{self.restart_stem}-{step}.xyz".

I might even suggest we could have an option to explicitly override the restart file name, so we always know it would be say restart_file = f"{self.SOME_USER_SPECIFIED_NAME}.xyz", even though that means overwriting it.

I'd try to do it now, but currently catching up on how the attachments work self.dyn.attach(self._write_restart, interval=self.restart_every).

Not sure what the best course of action is.

@ElliottKasoar
Copy link
Member Author

Actually revisiting this, it's while it may not be entirely intuitive, I think it currently works as I originally intended for normal MD.

In the original proof of concept, there's a separate auto restart option, which I didn't feel essential for the initial implementation. It's probably worth adding that option back in, but I'm not sure it's necessarily a problem that by default it relies on the user specifying the correct files.

My assumption now was if I pass in --struct NAME_OF_LAST_RESTART_FILE --stats-file NAME_OF_LAST_DATA_FILE, then janus would truely restart the sim, as I have explicitly set the structure? Which is a way round the issue.

Yes, I think that's right.

@alinelena and I were just discussing also the possibility of having a "final" restart file that has a known filename to aid automation of workflows.

Currently If I change the final temperature any follow on workflow will have to be changed to account for the new file name that depends on the step (I think). And it may not be the final step either if the user does not set the interval correctly. e.g.restart_file = f"{self.restart_stem}-{step}.xyz".

I might even suggest we could have an option to explicitly override the restart file name, so we always know it would be say restart_file = f"{self.SOME_USER_SPECIFIED_NAME}.xyz", even though that means overwriting it.

I'd try to do it now, but currently catching up on how the attachments work self.dyn.attach(self._write_restart, interval=self.restart_every).

Not sure what the best course of action is.

I'll have to think about this more/discuss it when I have more time, but I think it's useful to keep the step in the file names, as otherwise it's not necessarily obvious where you're picking up from.

Assuming we keep multiple restart files, I'm not sure there's an obvious way of naming the others anyway?

@alinelena
Copy link
Member

the idea is not to change the restart files but add another file that is the final frame, of the simulation or of the T for a ramp.

@harveydevereux
Copy link
Contributor

harveydevereux commented Apr 16, 2024

As Alin said yes, keeping the file names so we know what the restart is for is definitely important, I think you are right on that Elliott.

Alin and I iterated a little on this, culminating in this main...harveydevereux:janus-core:final_restart

This would leave a work directory with these files, where final is an end of heating step dump, and res are the usual restarts/stats/traj as before (but now with temperature ranges if temp_start and temp_end are given, it would be just temp if no heating is asked for)

LiF-nvt-final-T100.0.xyz  LiF-nvt-final-T260.0.xyz    LiF-nvt-res-T240.0-120.xyz
LiF-nvt-final-T120.0.xyz  LiF-nvt-final-T280.0.xyz    LiF-nvt-res-T280.0-140.xyz
LiF-nvt-final-T140.0.xyz  LiF-nvt-final-T300.0.xyz    LiF-nvt-res-T40.0-20.xyz
LiF-nvt-final-T160.0.xyz  LiF-nvt-final-T40.0.xyz     LiF-nvt-res-T80.0-40.xyz
LiF-nvt-final-T180.0.xyz  LiF-nvt-final-T60.0.xyz     LiF-nvt-T20.0-T300.0-stats.dat
LiF-nvt-final-T200.0.xyz  LiF-nvt-final-T80.0.xyz     LiF-nvt-T20.0-T300.0-traj.xyz
LiF-nvt-final-T20.0.xyz   LiF-nvt-res-T120.0-60.xyz   md.log
LiF-nvt-final-T220.0.xyz  LiF-nvt-res-T160.0-80.xyz   md_summary.yml
LiF-nvt-final-T240.0.xyz  LiF-nvt-res-T200.0-100.xyz

when called as so janus md --ensemble nvt --struct ../LiF.cif --temp-start 20 --temp-end 300 --temp-step 20 --temp-time 10 --device gpu --restart-every 20

Whether that is the best way forward, I'm not sure. This might complicate auto restart though, a lot, I expect.

@alinelena
Copy link
Member

i think we start to few issues in here.

  1. have final (shall be an issue about this)
  2. be sure restart works and works for md(shall work in general is simple, check the latest restart file, determine the timestep and continue) and heating(same as before but check also the T)
  3. rename the restart stem and filestem to take account of the heating.(there is an issue about this)
    1-3 will be addressed by @harveydevereux in a pr to come soon

@ElliottKasoar
Copy link
Member Author

ElliottKasoar commented Apr 29, 2024

So we now have the final file saved, and have updated the file names to account for heating and/or MD, but I think we're still missing:

  • Restart at the correct part of a temperature ramp, when heating
  • Check the restarting works in terms of steps/MD (I think the steps is already tested, and otherwise it should start from whatever structure we give it, so hopefully this is fine)
  • Add auto-restart/restart file detection (optionally, or automatic if we're restarting unless overwritten?)

Probably best to split the first two points into one issue, and the last into another, unless anyone disagrees with what needs doing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working janus
Projects
None yet
Development

No branches or pull requests

3 participants