Torsion fitting and generating openmm forces or parmed interactions? #304

Open
ahy3nz opened this issue Jan 6, 2020 · 1 comment

Comments

@ahy3nz
Contributor

ahy3nz commented Jan 6, 2020

Describe the behavior you would like added to Foyer
Some force fields utilize non-standard functional forms for bonded interactions; in particular, different torsion types within TraPPE. Non-standard functional forms will be supported in the new backend with the new XML style. But for now, it might be useful to have some routines to not only fit numerical data to functional forms, but also generate appropriate openmm forces that parmed can convert into pmd.Structure (we can't rely on using openmm custom force classes because parmed currently only understands some openmm custom impropers).

Describe the solution you'd like
If I provide a 2D numpy array of numerical data, foyer would fit a functional form to the data (harmonic bond, harmonic angle, periodic torsion, RB torsion, some custom improper), generate the openmm force object for the system, and populate that force object with the participating atoms. The structure returned by pmd.openmm.load_topology would then carry these new, custom-fit parameters.
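As a rough sketch of the fitting step only (not of anything foyer currently provides), the Ryckaert-Bellemans form V(psi) = sum over n of c_n * cos(psi)^n is linear in its coefficients, so a plain polynomial least-squares fit in cos(psi) is enough. The function names `fit_rb_torsion` and `rb_energy`, the (N, 2) column layout, and the synthetic coefficients below are all assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def rb_energy(psi, coeffs):
    """Evaluate V(psi) = sum_n c_n * cos(psi)**n for coefficients c0..c5."""
    return P.polyval(np.cos(psi), coeffs)


def fit_rb_torsion(data):
    """Fit RB coefficients c0..c5 to an (N, 2) array (hypothetical helper).

    Column 0: dihedral angle in radians, assumed to already follow the
    RB angle convention. Column 1: energy in any consistent unit.
    """
    data = np.asarray(data, dtype=float)
    if data.ndim != 2 or data.shape[1] != 2:
        raise ValueError("expected an (N, 2) array of (angle, energy) rows")
    # The RB form is a degree-5 polynomial in x = cos(psi), so an
    # ordinary linear least-squares polynomial fit recovers c0..c5.
    return P.polyfit(np.cos(data[:, 0]), data[:, 1], deg=5)


# Synthetic target data built from arbitrary, made-up coefficients.
psi = np.linspace(-np.pi, np.pi, 73)
true_c = np.array([1.0, 2.0, -3.0, 0.5, 4.0, -1.5])
data = np.column_stack([psi, rb_energy(psi, true_c)])

fitted = fit_rb_torsion(data)
```

The six fitted values map onto the c0..c5 arguments that an RB torsion force expects; creating the force object and assigning the participating atoms is left out here because it depends on the system being parametrized.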

Describe alternatives you've considered
From prior experience, if the numerical data doesn't qualitatively match the specified functional form, you will get a terribly fit function, and it's definitely on the user to realize their fits are bad. Numerical function fitting is straightforward enough for users to fit their own data, extract parameters, and then add them to their own XML, so maybe fitting isn't within the realm of foyer. Once a user has fit their data and added the appropriate XML lines, foyer should parametrize the new interactions (and we don't need any new code in foyer).
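One way a user might notice a bad fit is to compute simple goodness-of-fit numbers after fitting. The sketch below is only an illustration: `fit_quality` is a hypothetical helper, the energy profile is synthetic, and any cutoff for "bad" is a judgment call. It fits a threefold periodic profile once with a mismatched harmonic (quadratic) form and once with the matching periodic form.

```python
import numpy as np


def fit_quality(observed, predicted):
    """Return (RMSE, R^2) for a fit; hypothetical helper for illustration."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = observed - predicted
    rmse = float(np.sqrt(np.mean(residual ** 2)))
    total = float(np.sum((observed - observed.mean()) ** 2))
    r2 = 1.0 - float(np.sum(residual ** 2)) / total if total > 0 else float("nan")
    return rmse, r2


# Synthetic threefold periodic torsion profile.
phi = np.linspace(-np.pi, np.pi, 73)
energy = 1.0 + np.cos(3.0 * phi)

# Mismatched form: a harmonic (quadratic) fit in phi.
harmonic = np.polyval(np.polyfit(phi, energy, 2), phi)
bad_rmse, bad_r2 = fit_quality(energy, harmonic)

# Matching form: k * (1 + cos(3 * phi)), solved for k by linear least squares.
basis = 1.0 + np.cos(3.0 * phi)
k = float(basis @ energy / (basis @ basis))
good_rmse, good_r2 = fit_quality(energy, k * basis)
```

Here the harmonic fit leaves almost all of the variance unexplained while the periodic fit reproduces the data, which is the kind of contrast a user would want to see before adding parameters to an XML.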

Beyond modifying XML files, and elsewhere in a foyer workflow, we could create openmm generators and add those generators to the forcefield object, so foyer.forcefield.createSystem will know about these new, non-XML forces.

Or, further down the line (within foyer.forcefield.parametrize_system), we could create and modify our own openmm force objects.

I think I'm in favor of having users fit their own parameters, with foyer providing functions to add openmm generators to the forcefield object. If we consider a single XML line as a single set of parameters for a single type of interaction, a user's fitting procedure also yields a set of parameters for a single type of interaction; both can be processed within a forcefield object. It is this set of parameters that gets registered (like this example for a harmonicbondgenerator that parses the XML line, registers the bond parameters, and saves those bond parameters later for ff.createSystem and force.createForce).

Additional context
In terms of foyer's role, a user could definitely get by with their own numerical fitting and XML-updating. However, to avoid "soft forks" or changes to a canonical XML file, maybe foyer could use some routines to add interactions and parameters that are unspecified within an XML.

In terms of long-term support for function fitting with the new backend, even though the backend will support arbitrary functions and identify lack of function support in particular engines, it could be useful to allow users to re-fit custom functions such that they will be compatible with particular engines. However, this objective should be accomplished without using the openmm API (but could be extended to accommodate the openmm API), whereas the solutions raised above in this issue are built around the openmm API.

@mattwthompson
Member

What I see missing in that solution is saving out the result of a fitting procedure. I assume the actual SciPy function call is trivially quick, but generating that fitting on-the-fly produces a huge potential for irreproducibility. How easily can these results then be saved back into an XML? Saving the parameters out is almost a non-negotiable for me. I would also prefer the data be saved into an XML by a program, and not by humans. I recognize this may not be straightforward to do.
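A minimal sketch of what "saved into an XML by a program" could look like, using only the standard library. It assumes the OpenMM-style layout in which an RBTorsionForce element holds Proper entries with class1..class4 and c0..c5 attributes; `add_rb_torsion`, the atom class names, and the coefficient values are hypothetical and not part of foyer.

```python
import xml.etree.ElementTree as ET


def add_rb_torsion(xml_text, classes, coeffs):
    """Return force field XML text with one fitted RB torsion appended.

    Hypothetical helper. `classes` holds four atom class names and
    `coeffs` holds the six fitted coefficients c0..c5.
    """
    if len(classes) != 4 or len(coeffs) != 6:
        raise ValueError("need 4 atom classes and 6 RB coefficients")
    root = ET.fromstring(xml_text)
    # Reuse an existing RBTorsionForce section if there is one.
    force = root.find("RBTorsionForce")
    if force is None:
        force = ET.SubElement(root, "RBTorsionForce")
    attrib = {f"class{i}": name for i, name in enumerate(classes, start=1)}
    # repr() of a float round-trips exactly, so the saved values can be
    # read back without loss.
    attrib.update({f"c{n}": repr(float(c)) for n, c in enumerate(coeffs)})
    ET.SubElement(force, "Proper", attrib)
    return ET.tostring(root, encoding="unicode")


# Made-up input: a bare force field and arbitrary coefficients.
original = "<ForceField><AtomTypes /></ForceField>"
updated = add_rb_torsion(
    original,
    ("CH3", "CH2", "CH2", "CH3"),
    [1.0, 2.0, -3.0, 0.5, 4.0, -1.5],
)
```

Writing the result to a new file rather than overwriting the canonical XML would keep the original force field untouched.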

On this basis I would prefer a fitting step be done not during parametrization of a system but beforehand, using only the target data (we don't need to have a system in memory to do a fitting between a tetrad of atom types).

Something else to consider is how we want to go about storing training data. At the level of a fitting routine, unlabeled floats in arrays are plenty, but imho a lot of care needs to be taken in storing the metadata: units, desired functional forms, what chemistry it is trained against, perhaps other information about the QM done to obtain the data, and probably many other things I haven't thought of.
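To make the metadata concern concrete, one possible shape for a stored training set is sketched below. Every field name, the `TorsionTrainingSet` class itself, and the placeholder values are assumptions for illustration, not an agreed schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class TorsionTrainingSet:
    """Target data plus the metadata needed to interpret it (hypothetical)."""

    functional_form: str      # e.g. "rb_torsion" or "periodic_torsion"
    atom_classes: List[str]   # the tetrad of atom types being fit
    angle_unit: str
    energy_unit: str
    chemistry: str            # what the data was trained against
    qm_method: str            # how the reference energies were obtained
    angles: List[float] = field(default_factory=list)
    energies: List[float] = field(default_factory=list)

    def to_json(self):
        # Sorted keys give stable, diff-friendly output.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))


# Placeholder values only; none of this is real reference data.
record = TorsionTrainingSet(
    functional_form="rb_torsion",
    atom_classes=["CH3", "CH2", "CH2", "CH3"],
    angle_unit="radian",
    energy_unit="kJ/mol",
    chemistry="placeholder description",
    qm_method="placeholder method",
    angles=[0.0, 1.0, 2.0],
    energies=[0.5, 1.5, 0.25],
)
restored = TorsionTrainingSet.from_json(record.to_json())
```

Keeping the numbers and their labels in one record means a fit can be rerun later from the stored data alone, without a system in memory.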

On where this routine would go, I think here is appropriate now. I'd like to see a more involved force field manipulation library spun off in the future, but I'm not fussed about parking this in a corner of foyer until then.

Please add any thoughts on this discussion @mosdef-hub/mosdef-contributors
