Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discontinuities in NEP predicted forces #476

Open
Dankomaister opened this issue Aug 22, 2023 · 17 comments
Open

Discontinuities in NEP predicted forces #476

Dankomaister opened this issue Aug 22, 2023 · 17 comments

Comments

@Dankomaister
Copy link

Hi I’m training a NEP to study ferrocene decomposition.
As a test of my current model, I checked the PES for the dissociation of one of the cyclopentadienyl rings, like this
decomp

While the energy looks okay as seen here for 4 NEP models compared to DFT.
decomp

I found that there is a strange discontinuity in the force for a separation distance of around 3Å, as seen here.
decomp

Is this to be expected? I had assumed that NEP would give smooth continuous forces.

As a side note, I have seen this for other systems as well.
Here is an example of the force between the layers of AB stacked bilayer graphene as a function of a separation distance.
The same strange discontinuities in the force can be seen here.
decomp

Is this a bug? or is there anything that can be done to mitigate this?

/Daniel

@brucefan1983
Copy link
Owner

I think your results are not uncommon. For a carbon NEP model trained against the GAP2020 dataset (https://gitlab.com/brucefan1983/nep-data/-/tree/main/C/C_GAP2020), I got the following bilayer graphene exfoliation energy and force curves:
image

This NEP model is actually quite good compared to the literature, at least in terms of the achieved energy and force RMSEs. However, it is clearly not very good in terms of the graphite binding energy.

I am working on adding the DFT-D3 dispersion correction to NEP, which I think will mitigate this problem. In this way, we only need to train a NEP model with a relatively short cutoff against pure DFT data (without dispersion correction).

Can I ask if you have added dispersion correction to the DFT reference data for your ferrocene decomposition problem?

@brucefan1983
Copy link
Owner

And about your question on the "discontinuity", I think there is no bug. It is just inaccuracy force prediction.

@Dankomaister
Copy link
Author

I think your results are not uncommon. For a carbon NEP model trained against the GAP2020 dataset (https://gitlab.com/brucefan1983/nep-data/-/tree/main/C/C_GAP2020), I got the following bilayer graphene exfoliation energy and force curves: image

This NEP model is actually quite good compared to the literature, at least in terms of the achieved energy and force RMSEs. However, it is clearly not very good in terms of the graphite binding energy.

I am working on adding the DFT-D3 dispersion correction to NEP, which I think will mitigate this problem. In this way, we only need to train a NEP model with a relatively short cutoff against pure DFT data (without dispersion correction).

Can I ask if you have added dispersion correction to the DFT reference data for your ferrocene decomposition problem?

Ah that is interesting, have you tried to sweep the interlayer distance with smaller steps to see if you also get this discontinuity in the force?

For MLFFs to learn accurate dispersion interactions one need a bit specialized training data (GAP-20 does not have good data for learning dispersion interactions), but it is doable.

Including DFT-D3 in NEP I think is a good idea as long as it doesn’t slow things down too much.
Is your plan to allow for optimization of the DFT-D3 parameters during training?

That would be great since then we could add the DFT-D3 correction when training on data labeled with other dispersion corrections or other DFT functionals. I suppose it would also be a good idea if one could manually set the DFT-D3 parameters. Then one can label the data with simple PBE (or other functionals for which DFT-D3 parameters exists) and then just add the DFT-D3 correction post training.

For the ferrocene decomposition my training data is labeled using nonlocal vdW-DF functionals, in this case I’m using rev-vdW-DF2.

@Dankomaister
Copy link
Author

And about your question on the "discontinuity", I think there is no bug. It is just inaccuracy force prediction.

I do not agree that discontinuity in the force is “just inaccuracy force prediction”, discontinuities is far more serious.

Let me show an example of this using DeePMD.

The AB stacked bilayer graphene interlayer force curve I showed in my previous post comes from me training a NEP model a DeePMD training set I created for growing nanotubes (https://doi.org/10.21203/rs.3.rs-3197610/v1). Similar discontinuities (noise?) in the force prediction for DeePMD trained on this dataset can be seen for AB stacked bilayer graphene

Fig. 1

With DeePMD however it is trivial to see what gives rise to these discontinuities.

The result in Fig. 1 was obtained for a DeePMD potential trained with rcut_smth=5.0 Å a very large value close to the real cutoff, rcut=5.5 Å. While mathematical the weighting function

still mathematically decreases smoothly to zero for this case. In practice the very sharp decrease over 0.5 Å causes the weighting function to act more like a step function (at least numerical) which introduces discontinuities in the MLFF.

Reducing rcut_smth results in smooth continues forces as can be seen here for a DeePMD potential trained on the same data with the same parameter except rcut_smth=0.5 Å

Fig. 2

Natural running MD with the DeePMD potential shown in Fig. 1 will give rise to more integration error compared to that of Fig. 2.

While I’m not as familiar with the inner workings of NEP, I can’t find any obvious parameter that might result in similar numerically introduces discontinuities in NEP? It is slightly worrying to me that the discontinuity I see for the ferrocene decomposition is close to 3.5 Å which is the angular cutoff I used in NEP. Does the value from the angular descriptors decrease smoothly towards zero as well as the radial ones in NEP?

/Daniel

@brucefan1983
Copy link
Owner

I don't have denser data for the time being. I feel that with NEP, the "discontinuity" can be alleviated by increasing the angular cutoff and also optionally increase the regularization (lambda_1 and lambda_2).

@Dankomaister
Copy link
Author

Thanks for the suggestions! I will train some more models with different parameters to check.
Just for reference, here are the NEP parameters I used

version		4
type		3 Fe C H
cutoff		5.5 3.5
n_max		8 8
basis_size	6 6
l_max		4 2 0
neuron		54
lambda_e	0.5
lambda_f	1.0
lambda_v	0.01
batch		200
population	64
generation	300000

@brucefan1983
Copy link
Owner

Thanks for sharing the input parameters. I think you can try to increase the angular cutoff and observe the changes.

Also, I would use larger basis_size, whose default values are increased from 8 8 to 12 12. Your values of 6 6 seem to be quite small, which could limit the training accuracy.

Another note is that the current default regularization is a bit too small. I will increase the default values but you can try to set

lambda_1 0.15
lambda_2 0.15

for your 3-component systems.

@brucefan1983
Copy link
Owner

I have already increased the default weights for regularization. So it will be fine if you use the updated master branch with the default regularization (that is, leaving $\lambda_1$ and $\lambda_2$ unspecified).

@brucefan1983
Copy link
Owner

And for your interest, DFT-D3 is already available in GPUMD #482

@Dankomaister
Copy link
Author

DFT-D3

Great work adding DFT-D3 💪
I have to find some time to try it.

@Dankomaister
Copy link
Author

Okay, I had some time to try a few more things.
Using the NEP parameters in my previous post as a base I tried the following.

Increase the regularization

lambda_1      0.5
lambda_2      0.5

gave the following result

Decreasing the the regularization

lambda_1      0.0
lambda_2      0.0

gave the following result

I also tried to increase the cutoff for the angular descriptors

cutoff            5.5 5.5

which gave the following result

This gave the best results so far, but it is not a good solution since it makes the MLFF much slower. But it gave me the idea to try to reduce the number of angular descriptors

n_max           8 0
basis_size      6 0

which resulted in

This is interesting, obviously without angular descriptors the accuracy is much worse but perhaps more smooth? Could what we are seeing here be a strange interplay between the angular and radial cutoffs?

/Daniel

@brucefan1983
Copy link
Owner

I do not think the curve without angular descriptor components is more smooth.

For your "best results" with 5.5 A angular cutoff, what is the regularization?

@hityingph
Copy link
Collaborator

@Dankomaister Hi Daniel, we have tested the NEP-D3 on bilayer graphene. As shown in the following figure, we trained the NEP model using radius and angular cutoff of 4.5 angstrom, and the binding energy curve predicted by combing the NEP model and D3 is identical to the PBE-D3 counterpart.

39e06104f0db9c773c6ce5af9b87076

@Dankomaister
Copy link
Author

I do not think the curve without angular descriptor components is more smooth.

For your "best results" with 5.5 A angular cutoff, what is the regularization?

I used the same NEP parameters as in my previous post, so regularization was the same ie default values.

@Dankomaister
Copy link
Author

@Dankomaister Hi Daniel, we have tested the NEP-D3 on bilayer graphene. As shown in the following figure, we trained the NEP model using radius and angular cutoff of 4.5 angstrom, and the binding energy curve predicted by combing the NEP model and D3 is identical to the PBE-D3 counterpart.

39e06104f0db9c773c6ce5af9b87076

Looks good! What cutoff was used for the D3 part? How did adding D3 affect the speed? Also did you have a chance to look at the force?

@brucefan1983
Copy link
Owner

@Dankomaister Hi Daniel, we have tested the NEP-D3 on bilayer graphene. As shown in the following figure, we trained the NEP model using radius and angular cutoff of 4.5 angstrom, and the binding energy curve predicted by combing the NEP model and D3 is identical to the PBE-D3 counterpart.
39e06104f0db9c773c6ce5af9b87076

Looks good! What cutoff was used for the D3 part? How did adding D3 affect the speed? Also did you have a chance to look at the force?

The cutoff for D3 part is 12 A, which already reduces the overall speed by a factor of 2. Using a longer cutoff for D3 will significantly reduce the speed. The results are now published (there are typos in the table: theree should be no minus sign in the last column of table 1): https://iopscience.iop.org/article/10.1088/1361-648X/ad1278/meta

We have not looked at the force, but the D3 part has very smooth energy and force, it should be fine.

@brucefan1983
Copy link
Owner

I believe for accurate description of vdw interactions, NEP + D3 is the way to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants