Implementation of pair_style uf3 and uf3/kk #4110

monk-04 · 2024-03-25T20:25:20Z

Summary

This pull request implements the ultra-fast forcefield (uf3) machine learned interatomic potential. Both the CPU and KOKKOS versions of uf3 are implemented.

Related Issue(s)

None

Author(s)

Ajinkya C Hire (University of Florida), email- ajinkya.hire@ufl.edu, ajinkyahire62@gmail.com
Hendrik Kraß (University of Constance)
Matthias Rupp (Luxembourg Institute of Science and Technology), email- mrupp@mrupp.info
Richard Hennig (University of Florida), email- rhennig@ufl.edu

Licensing

By submitting this pull request, I agree, that my contribution will be included in LAMMPS and redistributed under either the GNU General Public License version 2 (GPL v2) or the GNU Lesser General Public License version 2.1 (LGPL v2.1).

Backward Compatibility

Will not break backward compatibility

Implementation Notes

The correctness of the implementation was verified by comparing the lammps computed energy and forces with that calculated from the python version of uf3 found here. The current version computes two-body and three-body interactions, contributions from one-body are ignored.

Post Submission Checklist

…oved some old comments

srtee · 2024-03-27T22:50:28Z

src/ML-UF3/pair_uf3.cpp

+    jnum = numshort - 1;
+    for (jj = 0; jj < jnum; jj++) {
+      fij[0] = fji[0] = 0;
+      fij[1] = fji[1] = 0;
+      fij[2] = fji[2] = 0;
+      j = neighshort[jj];
+      jtype = type[j];
+      del_rji[0] = x[j][0] - xtmp;
+      del_rji[1] = x[j][1] - ytmp;
+      del_rji[2] = x[j][2] - ztmp;
+      rij =
+          sqrt(((del_rji[0] * del_rji[0]) + (del_rji[1] * del_rji[1]) + (del_rji[2] * del_rji[2])));
+
+      // kth atom
+      for (kk = jj + 1; kk < numshort; kk++) {
+
+        fik[0] = fki[0] = 0;
+        fik[1] = fki[1] = 0;
+        fik[2] = fki[2] = 0;
+
+        fjk[0] = fkj[0] = 0;
+        fjk[1] = fkj[1] = 0;
+        fjk[2] = fkj[2] = 0;
+
+        k = neighshort[kk];
+        ktype = type[k];
+        del_rki[0] = x[k][0] - xtmp;
+        del_rki[1] = x[k][1] - ytmp;
+        del_rki[2] = x[k][2] - ztmp;
+        rik = sqrt(
+            ((del_rki[0] * del_rki[0]) + (del_rki[1] * del_rki[1]) + (del_rki[2] * del_rki[2])));


Lots of unnecessary rij recalculations here -- see comment at line 1015

srtee · 2024-03-27T22:58:39Z

src/ML-UF3/pair_uf3.cpp

+        if (pot_3b) {
+          if (rij <= cut_3b_list[itype][jtype]) {
+            neighshort[numshort] = j;
+            if (numshort >= maxshort - 1) {
+              maxshort += maxshort / 2;
+              memory->grow(neighshort, maxshort, "pair:neighshort");
+            }
+            numshort = numshort + 1;
+          }
+        }


from what I understand, this creates an internal "shorter-neighbor list" within which to evaluate 3body interactions. To me a simpler approach would be:

something like double * rij_list = new double[inum] (but create and manage the memory in a broader context for better predictability and performance)

after calculating each rij, store it: rij_list[jj] = rij

you now have the distances on hand for 3-body interactions later on

An extra inum-worth of doubles isn't a prohibitive amount of memory -- 1000 doubles is 8 KB.

You could even afford to store the delx, dely, and delz doubles in their own arrays as well.

I agree storing an array of rij will reduce the number of computations later on.
But I also need to know the type of atoms in the 3b interaction evaluation. So I need to keep the current neighshort.

Also, in practice, the 3b cutoffs are always shorter than 2-body cutoffs (usually by a couple of angstroms). Do you still think I need to make the rij_list of the size of inum?

My main worry is a bunch of unpredictable memory realllocations happening inside the inner loop of a force calculation. The if(pot3b) ... code is being triggered nlocal * jnum times and all it takes is one processor needing to do a grow to imbalance that timestep's force calculation.

Types are straightforward to track: if jj runs from 0 to jnum, the distance rij_list[jj] belongs to particle jlist[jj] which has particle type type[jlist[jj]].

Types are straightforward to track: if jj runs from 0 to jnum, the distance rij_list[jj] belongs to particle jlist[jj] which has particle type type[jlist[jj]].

I am bit confused.
jnum is decided by the cutoff of 3body interaction and, in practice, is always less than 2body.

In neighshort, I am storing j ie jlist[jj] during the 2body loop. Not every j in 2body loop forms a valid 3body triangle.

So, if my understanding of your suggestion is correct, you are suggesting to set jnum of 3body loop equal to jnum of 3body loop(?). If that is done, a lot of iterations in the 3body loop will do nothing because of the distance cutoff checks.

I apologize if I am not understanding your suggestion correctly.

srtee · 2024-03-27T23:07:23Z

src/ML-UF3/pair_uf3.cpp

+          for (int l=0; l < n3b_coeff_matrix[key].size(); l++){
+            for (int m=0; m < n3b_coeff_matrix[key][l].size(); m++){
+              bytes += (double)2*n3b_coeff_matrix[key][l][m].size()*sizeof(double);


I don't see that n3b_coeff_matrix is used outside the initialization functions -- clearing it after initialization would save some memory (probably not much?).

I am passing n3b_coeff_matrix[key] by reference to 3b bspline object (triplet_bspline UFBS3b), which then uses it to calculate energies and forces

srtee · 2024-03-27T23:27:08Z

src/ML-UF3/uf3_pair_bspline.cpp

+double *uf3_pair_bspline::eval(double r)
+{
+
+  // Find knot starting position
+
+  int start_index=(this->*get_starting_index)(r);
+  /*if (knot_vect.front() <= r && r < knot_vect.back()) {
+    //Determine the interval for value_rij
+    for (int i = 3; i < knot_vect_size - 1; ++i) {
+      if (knot_vect[i] <= r && r < knot_vect[i + 1]) {
+        start_index = i;
+        break;
+      }
+    }
+  }*/
+
+  int knot_affect_start = start_index - 3;
+
+  double rsq = r * r;
+  double rth = rsq * r;
+
+  // Calculate energy
+
+  ret_val[0] = bspline_bases[knot_affect_start + 3].eval0(rth, rsq, r);
+  ret_val[0] += bspline_bases[knot_affect_start + 2].eval1(rth, rsq, r);
+  ret_val[0] += bspline_bases[knot_affect_start + 1].eval2(rth, rsq, r);
+  ret_val[0] += bspline_bases[knot_affect_start].eval3(rth, rsq, r);
+
+  // Calculate force
+
+  ret_val[1] = dnbspline_bases[knot_affect_start + 2].eval0(rsq, r);
+  ret_val[1] += dnbspline_bases[knot_affect_start + 1].eval1(rsq, r);
+  ret_val[1] += dnbspline_bases[knot_affect_start].eval2(rsq, r);
+
+  return ret_val;
+}


Instead of returning a pointer-to-double you can return a force double and modify the energy by reference. So the function signature would be

double uf3_pair_bspline::eval(double r, double &eng)

and your evaluations would proceed as

... eng = 0 eng += \* first energy component *\ eng += \* second energy component *\ ... f = 0 f += \* first force component *\ f += \* second force component *\ ... return f

I am not sure how much this changes performance but it will make code more readable for future maintainers.

srtee · 2024-03-27T23:33:00Z

src/ML-UF3/pair_uf3.cpp

+        double *pair_eval = UFBS2b[itype][jtype].eval(rij);
+
+        fpair = -1 * pair_eval[1] / rij;
+
+        fx = delx * fpair;
+        fy = dely * fpair;
+        fz = delz * fpair;
+
+        f[i][0] += fx;
+        f[i][1] += fy;
+        f[i][2] += fz;
+        f[j][0] -= fx;
+        f[j][1] -= fy;
+        f[j][2] -= fz;
+
+        if (eflag) evdwl = pair_eval[0];


See comment on the eval function -- it may be more readable to modify energy by reference. So the function signature would be

double CLASSNAME::eval(double r, double &eng)

and it could be called as

fpair = -1 * UFBS2b[itype][jtype].eval(rij, evdwl) / rij;

This has minor readability impact here -- later on in the 3-body functions, it is extremely useful to have the variable names in both the function definition in the spline CPPs and in the function call here, instead of having to separately document what each index in a returned double pointer means.

Should I also pass eflag to the eval function? Something like UFBS2b[itype][jtype].eval(rij, evdwl, eflag)? This will save some computations as energy evaluation is not needed on every timestep?

If the energy calculation is costly enough, then yes, you can pass eflag and let eval save some time. But you would know better whether that's worth it.

As a style thing I'd prefer to keep the reference doubles together after the value doubles:

double CLASSNAME::eval( double rij, int eflag, double& evdwl // return energy by ref ){ ...

Should I also pass eflag to the eval function? Something like UFBS2b[itype][jtype].eval(rij, evdwl, eflag)? This will save some computations as energy evaluation is not needed on every timestep?

since eflag doesn't change, the most efficient way to handle this would be to make eval a templated function and eflag a template argument. That could hoist the if statement to the outermost level and the if statement will be optimized away since you either call eval<0>() or eval<1>(). The compiler will then optimize the "if" away since the condition is a compile time constant. Check out examples in OPT or OPENMP.

srtee · 2024-03-27T23:39:08Z

src/ML-UF3/pair_uf3.cpp

+            double *triangle_eval = UFBS3b[itype][jtype][ktype].eval(rij, rik, rjk);
+
+            fij[0] = *(triangle_eval + 1) * (del_rji[0] / rij);
+            fji[0] = -fij[0];
+            fik[0] = *(triangle_eval + 2) * (del_rki[0] / rik);
+            fki[0] = -fik[0];
+            fjk[0] = *(triangle_eval + 3) * (del_rkj[0] / rjk);
+            fkj[0] = -fjk[0];
+
+            fij[1] = *(triangle_eval + 1) * (del_rji[1] / rij);
+            fji[1] = -fij[1];
+            fik[1] = *(triangle_eval + 2) * (del_rki[1] / rik);
+            fki[1] = -fik[1];
+            fjk[1] = *(triangle_eval + 3) * (del_rkj[1] / rjk);
+            fkj[1] = -fjk[1];
+
+            fij[2] = *(triangle_eval + 1) * (del_rji[2] / rij);
+            fji[2] = -fij[2];
+            fik[2] = *(triangle_eval + 2) * (del_rki[2] / rik);
+            fki[2] = -fik[2];
+            fjk[2] = *(triangle_eval + 3) * (del_rkj[2] / rjk);
+            fkj[2] = -fjk[2];
+
+            Fi[0] = fij[0] + fik[0];
+            Fi[1] = fij[1] + fik[1];
+            Fi[2] = fij[2] + fik[2];
+            f[i][0] += Fi[0];
+            f[i][1] += Fi[1];
+            f[i][2] += Fi[2];
+
+            Fj[0] = fji[0] + fjk[0];
+            Fj[1] = fji[1] + fjk[1];
+            Fj[2] = fji[2] + fjk[2];
+            f[j][0] += Fj[0];
+            f[j][1] += Fj[1];
+            f[j][2] += Fj[2];
+
+            Fk[0] = fki[0] + fkj[0];
+            Fk[1] = fki[1] + fkj[1];
+            Fk[2] = fki[2] + fkj[2];
+            f[k][0] += Fk[0];
+            f[k][1] += Fk[1];
+            f[k][2] += Fk[2];
+
+            if (eflag) evdwl = *triangle_eval;


This is where a call-by-reference method can be much clearer. For consistency with the 2body call I'd make eval still return evdwl by reference, so this eval has to be void (there isn't a natural particular double to return). But you could call it as:

UFBS3b[itype][jtype][ktype].eval(rij, rik, rjk, Fmag_ij, Fmag_ik, Fmag_jk, evdwl)

and voila! Each of your return variables are now named, not numbered!

…ody/3-body is selected

also remove redundant files

akohlmey

This is it for a first round of review. I didn't change most of the unprotected utils::logmesg() calls. In my opinion most should just be removed. They look like debugging output.

One more general comment. It is generally considered a bad idea to read the same file from all the MPI ranks. This can create huge problems with jobs using many MPI processes. The more common procedure is to read everything on rank 0 only and then use MPI communication to broadcast the information to all other ranks.

cmake/CMakeLists.txt

akohlmey · 2024-03-28T21:03:11Z

potentials/A_A_A.uf3_pot

What element does A stand for?

The potential files were modeled after Niobium UF3 potential files. I will replace them with the real files.

doc/src/pair_uf3.rst

src/ML-UF3/pair_uf3.cpp

Updates for ml-uf3 pull request

…ssors was getting some weird errors. Deleted commented (dead) code. Updated the memory_usage function. Reordered some functions to refelect the calling order

Remove std::vector completely

…oeff_array_size, n3b_knots_array_size arrays to 0; fixed memory leaks; removed some dead code

Bug Fixes- uniform knot spacing, memory leaks, array initialization

monk-04 · 2024-05-04T00:55:59Z

@akohlmey Thank you for your suggestion about moving away from std::vectors. It forced me to re-write certain functions. Now arrays are used to store constants of bspline basis set instead of storing uf3_pair_bspline and uf3_triplet_bspline objects (at UFBS2b std::vector<std::vector> and UFBS3b std::vector<std::vector<std::vector>>). Also, I evaluate the energy contribution only when 'eflag' is true instead of every time step.

All these changes led to about 30%-35% speed up for the CPU version.

stanmoore1 · 2024-05-13T18:41:20Z

@monk-04 is this PR ready for review or are more changes coming?

monk-04 · 2024-05-13T19:10:49Z

@stanmoore1 I was waiting for PR #4146 so that the MacOS unit test could pass.
I see that it has been merged and will merge lammps:develop into ml-uf3.

Other than that, I don't have any more changes in mind.

The PR is ready for review.

akohlmey · 2024-05-14T00:39:26Z

@monk-04 I've submitted a pull request with a few minor changes and replaced the example.

Update example with new syntax

monk-04 and others added 10 commits March 25, 2024 12:07

Added ml-uf3 src files

6dded43

Added src files for uf3/kk

53c6959

Added ml-uf3 to cmake and make files

9a23ddf

Added documentation for ml-uf3

923e251

Added ml-uf3 potential files

a720d0d

Added ml-uf3 unittest

5c536c8

Fixed typos in the UF3 equation

d55f750

Fixed lammps developer email-id and contributing authors section. Rem…

de43263

…oved some old comments

Removed LAMMPS errordocs

06c4fc6

fixed trailing whitespaces

b280999

akohlmey self-assigned this Mar 26, 2024

akohlmey added the enhancement label Mar 26, 2024

monk-04 and others added 11 commits March 26, 2024 12:34

Added uf3 details to Commands_pair, Packages_details and Packages_list

4e95db1

Added uf3 examples to the examples directory

3a34b3e

Removed trailing whitespaces

ff39a03

Added example potential files for W

dc0b0d8

add ML-UF3 to compatible CMake preset files

7e09353

update .gitignore

f9a0ec8

update code owners list

0d7c41b

move examples

7323364

reformat and fix spelling and related issues

fda433a

update ML-UF3 examples

713b308

some style cleanup and simplified pair_style and pair_coeff input

fca23da

srtee reviewed Mar 28, 2024

View reviewed changes

akohlmey added 4 commits March 28, 2024 17:01

examples folder was moved

eb89c7a

manybody and single flag need to be changed from the default when 2-b…

392c3b6

…ody/3-body is selected

update more files and docs for .uf3 potential file extension

a6e5c8b

also remove redundant files

some formatting and logic cleanup.

ecb5704

akohlmey requested changes Mar 28, 2024

View reviewed changes

Merge pull request #1 from akohlmey/ml-uf3

a26c281

Updates for ml-uf3 pull request

monk-04 and others added 11 commits May 2, 2024 18:59

Removed uf3_pair_bspline and uf3_triplet_bspline

f02c65e

Commented out the uniform knot spacing logic as for more than 1 proce…

5f2cae0

…ssors was getting some weird errors. Deleted commented (dead) code. Updated the memory_usage function. Reordered some functions to refelect the calling order

Removed unused variables and reordered the code

3340dd5

Fixed trailing whitespace

ab8e4b7

Removed dead code, fixed trailing whitespace

4a8ed1b

Merge pull request #5 from monk-04/ml-uf3

94ab3c2

Remove std::vector completely

Merge branch 'lammps:develop' into ml-uf3

759ce70

Fixed bug with uniform knot spacing

a7f4fc1

Initialize setflag, n2b_coeff_array_size, n2b_knots_array_size, n3b_c…

1cfbc04

…oeff_array_size, n3b_knots_array_size arrays to 0; fixed memory leaks; removed some dead code

Removed trailing whitespace

fe24dde

Merge pull request #6 from monk-04/ml-uf3

2f488cc

Bug Fixes- uniform knot spacing, memory leaks, array initialization

stanmoore1 assigned monk-04 May 13, 2024

Merge branch 'lammps:develop' into ml-uf3

9e755cd

stanmoore1 marked this pull request as ready for review May 13, 2024 19:11

stanmoore1 self-requested a review as a code owner May 13, 2024 19:11

monk-04 requested a review from akohlmey May 13, 2024 19:11

stanmoore1 requested a review from srtee May 13, 2024 19:12

stanmoore1 assigned stanmoore1 and unassigned monk-04 May 13, 2024

akohlmey added 4 commits May 13, 2024 20:16

Merge branch 'develop' into ml-uf3

49f2022

fix spelling

4302d65

remove unused variables

17c0994

update ML-UF3 example

1f52d17

akohlmey requested a review from athomps May 14, 2024 00:41

Merge pull request #7 from akohlmey/ml-uf3

991f4ed

Update example with new syntax

akohlmey approved these changes May 14, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implementation of pair_style uf3 and uf3/kk #4110

Implementation of pair_style uf3 and uf3/kk #4110

monk-04 commented Mar 25, 2024 •

edited

srtee Mar 27, 2024

srtee Mar 27, 2024

srtee Mar 28, 2024

monk-04 Mar 28, 2024

srtee Mar 28, 2024

monk-04 Mar 28, 2024 •

edited

srtee Mar 27, 2024

monk-04 Mar 28, 2024

srtee Mar 27, 2024

srtee Mar 27, 2024

monk-04 Mar 28, 2024

srtee Mar 28, 2024

akohlmey Mar 28, 2024 •

edited

srtee Mar 27, 2024

akohlmey left a comment

akohlmey Mar 28, 2024

monk-04 Apr 2, 2024 •

edited

monk-04 commented May 4, 2024 •

edited

stanmoore1 commented May 13, 2024

monk-04 commented May 13, 2024 •

edited

akohlmey commented May 14, 2024

Implementation of pair_style uf3 and uf3/kk #4110

Are you sure you want to change the base?

Implementation of pair_style uf3 and uf3/kk #4110

Conversation

monk-04 commented Mar 25, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

monk-04 Mar 28, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

akohlmey Mar 28, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

akohlmey left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

monk-04 Apr 2, 2024 • edited

Choose a reason for hiding this comment

monk-04 commented May 4, 2024 • edited

stanmoore1 commented May 13, 2024

monk-04 commented May 13, 2024 • edited

akohlmey commented May 14, 2024

monk-04 commented Mar 25, 2024 •

edited

monk-04 Mar 28, 2024 •

edited

akohlmey Mar 28, 2024 •

edited

monk-04 Apr 2, 2024 •

edited

monk-04 commented May 4, 2024 •

edited

monk-04 commented May 13, 2024 •

edited