
[BUG] Training using init-model gives normal lcurve and bad model #3751

Open
zjgemi opened this issue May 6, 2024 · 0 comments
zjgemi commented May 6, 2024

Bug summary

In the DPGen workflow, training in iter-0 seems fine. The model trained in iter-1 (with init-model) has a large RMSE of ~100 meV, while the lcurve suggests better accuracy.

[screenshot: 20240506-193625]

For the worst system, the RMSE increases by a factor of >30 after the iter-1 training.

[screenshot: 20240506-194007]

This phenomenon does not appear when using finetune (instead of init-model) in iter-1.
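For context, a minimal sketch of the two ways the iter-1 training can be started with the PyTorch backend (the checkpoint filename below is illustrative, not taken from this issue's input files):

```sh
# init-model: load only the model parameters from the iter-0 checkpoint
# and start a fresh training run (the path shown here is a placeholder).
dp --pt train input.json --init-model model.ckpt.pt

# finetune: the alternative that, per this report, does NOT show the problem.
dp --pt train input.json --finetune model.ckpt.pt
```

The report suggests the discrepancy between the lcurve and the actual test RMSE appears only on the `--init-model` path.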

DeePMD-kit Version

stable-0411

Backend and its version

Pytorch

How did you download the software?

docker

Input Files, Running Commands, Error Log, etc.

iter1input.zip

Steps to Reproduce

bash aefcb166ade9f2faf80a15e8a6f0d0cb70a6d33a.sub

Further Information, Files, and Links

No response

@zjgemi zjgemi added the bug label May 6, 2024
@iProzd iProzd self-assigned this May 6, 2024
Project status: Backlog