Discontinuities in NEP predicted forces #476

Dankomaister · 2023-08-22T07:53:39Z

Hi I’m training a NEP to study ferrocene decomposition.
As a test of my current model, I checked the PES for the dissociation of one of the cyclopentadienyl rings, like this

While the energy looks okay as seen here for 4 NEP models compared to DFT.

I found that there is a strange discontinuity in the force for a separation distance of around 3Å, as seen here.

Is this to be expected? I had assumed that NEP would give smooth continuous forces.

As a side note, I have seen this for other systems as well.
Here is an example of the force between the layers of AB stacked bilayer graphene as a function of a separation distance.
The same strange discontinuities in the force can be seen here.

Is this a bug? or is there anything that can be done to mitigate this?

/Daniel

brucefan1983 · 2023-08-22T17:18:17Z

I think your results are not uncommon. For a carbon NEP model trained against the GAP2020 dataset (https://gitlab.com/brucefan1983/nep-data/-/tree/main/C/C_GAP2020), I got the following bilayer graphene exfoliation energy and force curves:

This NEP model is actually quite good compared to the literature, at least in terms of the achieved energy and force RMSEs. However, it is clearly not very good in terms of the graphite binding energy.

I am working on adding the DFT-D3 dispersion correction to NEP, which I think will mitigate this problem. In this way, we only need to train a NEP model with a relatively short cutoff against pure DFT data (without dispersion correction).

Can I ask if you have added dispersion correction to the DFT reference data for your ferrocene decomposition problem?

brucefan1983 · 2023-08-22T17:25:15Z

And about your question on the "discontinuity", I think there is no bug. It is just inaccuracy force prediction.

Dankomaister · 2023-08-22T23:52:28Z

I think your results are not uncommon. For a carbon NEP model trained against the GAP2020 dataset (https://gitlab.com/brucefan1983/nep-data/-/tree/main/C/C_GAP2020), I got the following bilayer graphene exfoliation energy and force curves:

This NEP model is actually quite good compared to the literature, at least in terms of the achieved energy and force RMSEs. However, it is clearly not very good in terms of the graphite binding energy.

I am working on adding the DFT-D3 dispersion correction to NEP, which I think will mitigate this problem. In this way, we only need to train a NEP model with a relatively short cutoff against pure DFT data (without dispersion correction).

Can I ask if you have added dispersion correction to the DFT reference data for your ferrocene decomposition problem?

Ah that is interesting, have you tried to sweep the interlayer distance with smaller steps to see if you also get this discontinuity in the force?

For MLFFs to learn accurate dispersion interactions one need a bit specialized training data (GAP-20 does not have good data for learning dispersion interactions), but it is doable.

Including DFT-D3 in NEP I think is a good idea as long as it doesn’t slow things down too much.
Is your plan to allow for optimization of the DFT-D3 parameters during training?

That would be great since then we could add the DFT-D3 correction when training on data labeled with other dispersion corrections or other DFT functionals. I suppose it would also be a good idea if one could manually set the DFT-D3 parameters. Then one can label the data with simple PBE (or other functionals for which DFT-D3 parameters exists) and then just add the DFT-D3 correction post training.

For the ferrocene decomposition my training data is labeled using nonlocal vdW-DF functionals, in this case I’m using rev-vdW-DF2.

Dankomaister · 2023-08-23T00:55:51Z

And about your question on the "discontinuity", I think there is no bug. It is just inaccuracy force prediction.

I do not agree that discontinuity in the force is “just inaccuracy force prediction”, discontinuities is far more serious.

Let me show an example of this using DeePMD.

The AB stacked bilayer graphene interlayer force curve I showed in my previous post comes from me training a NEP model a DeePMD training set I created for growing nanotubes (https://doi.org/10.21203/rs.3.rs-3197610/v1). Similar discontinuities (noise?) in the force prediction for DeePMD trained on this dataset can be seen for AB stacked bilayer graphene

Fig. 1

With DeePMD however it is trivial to see what gives rise to these discontinuities.

The result in Fig. 1 was obtained for a DeePMD potential trained with rcut_smth=5.0 Å a very large value close to the real cutoff, rcut=5.5 Å. While mathematical the weighting function

still mathematically decreases smoothly to zero for this case. In practice the very sharp decrease over 0.5 Å causes the weighting function to act more like a step function (at least numerical) which introduces discontinuities in the MLFF.

Reducing rcut_smth results in smooth continues forces as can be seen here for a DeePMD potential trained on the same data with the same parameter except rcut_smth=0.5 Å

Fig. 2

Natural running MD with the DeePMD potential shown in Fig. 1 will give rise to more integration error compared to that of Fig. 2.

While I’m not as familiar with the inner workings of NEP, I can’t find any obvious parameter that might result in similar numerically introduces discontinuities in NEP? It is slightly worrying to me that the discontinuity I see for the ferrocene decomposition is close to 3.5 Å which is the angular cutoff I used in NEP. Does the value from the angular descriptors decrease smoothly towards zero as well as the radial ones in NEP?

/Daniel

brucefan1983 · 2023-08-23T06:09:18Z

I don't have denser data for the time being. I feel that with NEP, the "discontinuity" can be alleviated by increasing the angular cutoff and also optionally increase the regularization (lambda_1 and lambda_2).

Dankomaister · 2023-08-23T06:19:35Z

Thanks for the suggestions! I will train some more models with different parameters to check.
Just for reference, here are the NEP parameters I used

version		4
type		3 Fe C H
cutoff		5.5 3.5
n_max		8 8
basis_size	6 6
l_max		4 2 0
neuron		54
lambda_e	0.5
lambda_f	1.0
lambda_v	0.01
batch		200
population	64
generation	300000

brucefan1983 · 2023-08-23T14:32:08Z

Thanks for sharing the input parameters. I think you can try to increase the angular cutoff and observe the changes.

Also, I would use larger basis_size, whose default values are increased from 8 8 to 12 12. Your values of 6 6 seem to be quite small, which could limit the training accuracy.

Another note is that the current default regularization is a bit too small. I will increase the default values but you can try to set

lambda_1 0.15
lambda_2 0.15

for your 3-component systems.

brucefan1983 · 2023-08-28T14:59:06Z

I have already increased the default weights for regularization. So it will be fine if you use the updated master branch with the default regularization (that is, leaving $\lambda_1$ and $\lambda_2$ unspecified).

brucefan1983 · 2023-08-30T19:12:01Z

And for your interest, DFT-D3 is already available in GPUMD #482

Dankomaister · 2023-09-06T06:13:00Z

DFT-D3

Great work adding DFT-D3 💪
I have to find some time to try it.

Dankomaister · 2023-09-06T06:48:22Z

Okay, I had some time to try a few more things.
Using the NEP parameters in my previous post as a base I tried the following.

Increase the regularization

lambda_1      0.5
lambda_2      0.5

gave the following result

Decreasing the the regularization

lambda_1      0.0
lambda_2      0.0

gave the following result

I also tried to increase the cutoff for the angular descriptors

cutoff            5.5 5.5

which gave the following result

This gave the best results so far, but it is not a good solution since it makes the MLFF much slower. But it gave me the idea to try to reduce the number of angular descriptors

n_max           8 0
basis_size      6 0

which resulted in

This is interesting, obviously without angular descriptors the accuracy is much worse but perhaps more smooth? Could what we are seeing here be a strange interplay between the angular and radial cutoffs?

/Daniel

brucefan1983 · 2023-09-06T10:34:57Z

I do not think the curve without angular descriptor components is more smooth.

For your "best results" with 5.5 A angular cutoff, what is the regularization?

hityingph · 2023-09-06T11:59:31Z

@Dankomaister Hi Daniel, we have tested the NEP-D3 on bilayer graphene. As shown in the following figure, we trained the NEP model using radius and angular cutoff of 4.5 angstrom, and the binding energy curve predicted by combing the NEP model and D3 is identical to the PBE-D3 counterpart.

Dankomaister · 2023-09-06T23:35:31Z

I do not think the curve without angular descriptor components is more smooth.

For your "best results" with 5.5 A angular cutoff, what is the regularization?

I used the same NEP parameters as in my previous post, so regularization was the same ie default values.

Dankomaister · 2023-09-06T23:38:26Z

@Dankomaister Hi Daniel, we have tested the NEP-D3 on bilayer graphene. As shown in the following figure, we trained the NEP model using radius and angular cutoff of 4.5 angstrom, and the binding energy curve predicted by combing the NEP model and D3 is identical to the PBE-D3 counterpart.

Looks good! What cutoff was used for the D3 part? How did adding D3 affect the speed? Also did you have a chance to look at the force?

brucefan1983 · 2024-01-31T15:29:53Z

@Dankomaister Hi Daniel, we have tested the NEP-D3 on bilayer graphene. As shown in the following figure, we trained the NEP model using radius and angular cutoff of 4.5 angstrom, and the binding energy curve predicted by combing the NEP model and D3 is identical to the PBE-D3 counterpart.

Looks good! What cutoff was used for the D3 part? How did adding D3 affect the speed? Also did you have a chance to look at the force?

The cutoff for D3 part is 12 A, which already reduces the overall speed by a factor of 2. Using a longer cutoff for D3 will significantly reduce the speed. The results are now published (there are typos in the table: theree should be no minus sign in the last column of table 1): https://iopscience.iop.org/article/10.1088/1361-648X/ad1278/meta

We have not looked at the force, but the D3 part has very smooth energy and force, it should be fine.

brucefan1983 · 2024-01-31T15:32:20Z

I believe for accurate description of vdw interactions, NEP + D3 is the way to go.

brucefan1983 added the investigate label Sep 4, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Discontinuities in NEP predicted forces #476

Discontinuities in NEP predicted forces #476

Dankomaister commented Aug 22, 2023

brucefan1983 commented Aug 22, 2023

brucefan1983 commented Aug 22, 2023

Dankomaister commented Aug 22, 2023

Dankomaister commented Aug 23, 2023

brucefan1983 commented Aug 23, 2023

Dankomaister commented Aug 23, 2023

brucefan1983 commented Aug 23, 2023

brucefan1983 commented Aug 28, 2023

brucefan1983 commented Aug 30, 2023

Dankomaister commented Sep 6, 2023

Dankomaister commented Sep 6, 2023

brucefan1983 commented Sep 6, 2023

hityingph commented Sep 6, 2023

Dankomaister commented Sep 6, 2023

Dankomaister commented Sep 6, 2023

brucefan1983 commented Jan 31, 2024

brucefan1983 commented Jan 31, 2024

Discontinuities in NEP predicted forces #476

Discontinuities in NEP predicted forces #476

Comments

Dankomaister commented Aug 22, 2023

brucefan1983 commented Aug 22, 2023

brucefan1983 commented Aug 22, 2023

Dankomaister commented Aug 22, 2023

Dankomaister commented Aug 23, 2023

brucefan1983 commented Aug 23, 2023

Dankomaister commented Aug 23, 2023

brucefan1983 commented Aug 23, 2023

brucefan1983 commented Aug 28, 2023

brucefan1983 commented Aug 30, 2023

Dankomaister commented Sep 6, 2023

Dankomaister commented Sep 6, 2023

brucefan1983 commented Sep 6, 2023

hityingph commented Sep 6, 2023

Dankomaister commented Sep 6, 2023

Dankomaister commented Sep 6, 2023

brucefan1983 commented Jan 31, 2024

brucefan1983 commented Jan 31, 2024