Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Foyer should identify missing parameters when it fails #323

Open
jpotoff opened this issue Mar 2, 2020 · 18 comments
Open

Foyer should identify missing parameters when it fails #323

jpotoff opened this issue Mar 2, 2020 · 18 comments
Labels

Comments

@jpotoff
Copy link

jpotoff commented Mar 2, 2020

When Foyer fails to assign parameters (especially for dihedrals), it should identify the parameter types that it could not find in the XML file.

example.zip

@jpotoff
Copy link
Author

jpotoff commented Mar 2, 2020

Replaced the vapor_box_param assignment with this, but did not get particularly useful output.
vapor_box_param=ff.apply(vapor_box,verbose='True')

UnboundLocalError Traceback (most recent call last)
in
2
3 ff=foyer.Forcefield(forcefield_files='opls_gomc.xml')
----> 4 vapor_box_param=ff.apply(vapor_box,verbose='True')
5 #liquid_box_param=ff.apply(liquid_box)
6 par_writer.write_par(vapor_box_param,'mbuild.par')

~/anaconda3/lib/python3.7/site-packages/foyer/forcefield.py in apply(self, topology, references_file, use_residue_map, assert_bond_params, assert_angle_params, assert_dihedral_params, assert_improper_params, combining_rule, verbose, *args, **kwargs)
499 assert_angle_params=assert_angle_params, assert_dihedral_params=assert_dihedral_params,
500 assert_improper_params=assert_improper_params, combining_rule=combining_rule,
--> 501 verbose=verbose, *args, **kwargs)
502
503 def run_atomtyping(self, topology, use_residue_map=True):

~/anaconda3/lib/python3.7/site-packages/foyer/forcefield.py in parametrize_system(self, topology, positions, references_file, assert_bond_params, assert_angle_params, assert_dihedral_params, assert_improper_params, combining_rule, verbose, *args, **kwargs)
567 _check_angles(data, structure, verbose, assert_angle_params)
568 _check_dihedrals(data, structure, verbose,
--> 569 assert_dihedral_params, assert_improper_params)
570
571 if references_file:

~/anaconda3/lib/python3.7/site-packages/foyer/forcefield.py in _check_dihedrals(data, structure, verbose, assert_dihedral_params, assert_improper_params)
297 missing_dihedral = False
298 if missing_dihedral:
--> 299 print('missing improper with ids {}'.format(pmd_ids))
300
301 if data.propers and len(data.propers) != \

UnboundLocalError: local variable 'pmd_ids' referenced before assignment

@rmatsum836
Copy link
Contributor

rmatsum836 commented Mar 2, 2020

Thanks for the suggestion, this was also suggested in #169, and (at least) partially addressed in #186. Also partially addressed in #293 .

@mattwthompson
Copy link
Member

Try vapor_box_param=ff.apply(vapor_box, verbose=True) is indeed what should address this issue (although that triggers a bug)

@jpotoff
Copy link
Author

jpotoff commented Mar 2, 2020

Try vapor_box_param=ff.apply(vapor_box, verbose=True) is indeed what should address this issue (although that triggers a bug)

Still getting the same output related to "UnboundLocalError"

@rmatsum836
Copy link
Contributor

Matt shared your XML and I think I found the issue. In this line:

<Proper class1="HC" class2="CT" class3="CT" class4="HC" periodicity1="3" phase2="0" k1="0.6276"/><Proper class1="HC" class2="CT" class3="CT" class4="HC" periodicity1="3" phase2="0" k1="0.6276"/>

I think there is a typo at phase2. Changing this to phase1, allows me to atomtype an ethane molecule.

@jpotoff
Copy link
Author

jpotoff commented Mar 2, 2020

I fixed the the "phase" problem, but the end result is the same for n-butane.

@rsdefever
Copy link
Member

rsdefever commented Mar 2, 2020

@jpotoff What version of foyer are you using? And how did you install? conda?

I am able to type a butane in GAFF (periodic dihedrals) with foyer 0.7.3 on my laptop.

@jpotoff
Copy link
Author

jpotoff commented Mar 2, 2020

@rsdefever where is your GAFF xml hiding? I saw it recently, but can't seem to find it.

@jpotoff
Copy link
Author

jpotoff commented Mar 2, 2020

I am running foyer 0.7.3

@mattwthompson
Copy link
Member

Using the original XML and this snippet (modify the SMILES string):

ff=foyer.Forcefield(forcefield_files='/Users/mwt/Downloads/example/opls_gomc.xml')

ff.apply(mb.load('CCCC', smiles=True))

ethane and propane can be typed, but butane cannot be
Using this stripped-down XML file, we can type butane. I believe the root cause is including the other terms in the dihedral, although I'm not sure why. The OpenMM docs imply this is fine

@rmatsum836
Copy link
Contributor

The issue is when we check the dihedrals, as len(structure.dihedrals != len(proper_dihedrals).

In this case, the structure identifies 27 dihedrals (I think this is correct). In this case, the C-C-C-C has 3 dihedrals for each periodicity listed in the XML (I think this is fine but I'm not sure). Because of these 3 dihedrals, there are a total of 29 proper_dihedrals in the system. I think this is a bug (I can raise this as an issue), but for now passing assert_dihedral_params=False will allow butane to be typed.

@rsdefever
Copy link
Member

I don't think thats the issue that @jpotoff is running into. I tried assert_dihedral_params=False with @jpotoff earlier and it didn't fix the issue. Something funkier is going on.

For me with GAFF, yes, I need to use assert_dihedral_params=False for it to work.

GAFF XML here: https://github.com/rsdefever/antefoyer

If you just want the XML file: https://github.com/rsdefever/antefoyer/blob/master/antefoyer/xml/gaff.xml

@rmatsum836
Copy link
Contributor

To make sure we're on the same page, this is the xml file I'm currently using: https://gist.github.com/rmatsum836/9e6ceee6cddc5ac059453ee3ce368bd4

And this is the script I'm using: https://gist.github.com/rmatsum836/94070e302d40cde8974de4697b0accfb

@jpotoff
Copy link
Author

jpotoff commented Mar 2, 2020

Ok, so loaded up gaff.xml using
ff=foyer.Forcefield(forcefield_files='gaff.xml')
vapor_box_param=ff.apply(vapor_box, assert_dihedral_params=False, verbose=True)

And I'm still getting the same error...

@ahy3nz
Copy link
Contributor

ahy3nz commented Mar 3, 2020

Matt shared your XML and I think I found the issue. In this line:

<Proper class1="HC" class2="CT" class3="CT" class4="HC" periodicity1="3" phase2="0" k1="0.6276"/><Proper class1="HC" class2="CT" class3="CT" class4="HC" periodicity1="3" phase2="0" k1="0.6276"/>

I think there is a typo at phase2. Changing this to phase1, allows me to atomtype an ethane molecule.

Make the above change to your xml

ff=foyer.Forcefield(forcefield_files='gaff.xml')
vapor_box_param=ff.apply(vapor_box, assert_dihedral_params=False, verbose=False)

assert_dihedral_params=False, verbose=False. If you have both those flags set, then GAFF-foyer and your opls_gomc.xml will atom-type and parametrize properly. Two things are happening here:

  1. Because of the layered nature of the periodic dihedrals, the number of dihedral bonded tuples will probably never match the number of dihedral types, and that's OK for periodic torsions. The problem more lies with how foyer checks dihedrals. Because of this assert_diheral_params=False

  2. verbose=False Again the problem lies with how foyer checks dihedrals, and where some of these terms get stored in the parmed.Structure. Avoid this problem by setting verbose=False

In a perfect world, the dihedral checking would do a better job looking at periodic torsions and rb torsions, and it would also do a better job checking that every dihedral tuple has a dihedraltype, not just the numbers of dihedral types match the number of dihedral tuples

@jpotoff
Copy link
Author

jpotoff commented Mar 3, 2020

Using these commands I was able to get foyer to output parameters:
ff=foyer.Forcefield(forcefield_files='gaff.xml')
vapor_box_param=ff.apply(vapor_box, assert_dihedral_params=False, verbose=False)

Using my opls_gomc.xml file also worked.

But...this behavior from foyer needs to be fixed. Turning on a "verbose" flag to get more information should never break the code.

Simply checking that the number dihedral types matches the number of dihedral bonded tuples is not the best practice as it only works as intended for parameter files that use RB torsions.

Furthermore, the default behavior of foyer does not follow what is given in the documentation:

https://mosdef.org/foyer/atom-typing_options.html

assert_dihedral_params : bool, optional, default=True

If True, Foyer will exit if parameters are not found for all system proper dihedrals.

In this case, all dihedrals were defined, but foyer exited with an error (i.e. the opposite behavior is produced). Dihedral checking had to be turned off to write the parameter file (via assert_dihedral_params=False). I see @rsdefever raised this in issue #295.

Running with assert_dihedral_params=False is, quite frankly, dangerous. I could delete some of the torsions from the XML parameter file and foyer was happy to produce a parameter file minus the torsions I has deleted. Only a "warning" is produced instead of failing with an error. At no point does foyer ever tell me what torsions are missing.

Overall, this is not appropriate behavior for a tool that one expects will be used by non-experts. Two things are going to happen. 1) users will get frustrated and give up, or 2) they disable safety checks just to get foyer to produce some kind of output, and run bad simulations. Fortunately, those missing torsions will be caught by codes like NAMD, GOMC, GROMACS, etc, but other codes may not.

@rsdefever
Copy link
Member

There are several things happening here, so I am going to try to break them down into distinct issues that we can address (in order of easiest to hardest, I think).

  1. The docstring for assert_dihedral_params should be updated to more accurately describe the current behavior.

  2. Fix verbose=True so that turning it on never breaks the code. Even if we are unable to output the full verbose results, we should tell the user than rather than erroring out. This seems to be a bug that needs to be addressed.

  3. Fix the behavior for assert_dihedral_params so it behaves the way we want it to. Right now one of the challenges is that if the force constant for a periodic dihedral is 0.0, it never gets parameterized: see comment from assert_dihedral_params incorrect behavior for layered dihedrals (e.g., GAFF) #295:

It appears that if the force constant for a dihedral is 0.0, then foyer (really openmm) doesn't add the dihedral to the parameterized list. An example of this is GAFF where there is a dihedral defined as: class1="" class2="c3" class3="ca" class4="". The force constant is 0.0. When I run a relevant molecule through foyer, it add no parameters for a ca-c3-ca-ca dihedral.

Unfortunately, this means that the number of parameterized dihedrals will sometimes be less than the number of physical dihedrals.

Which, is fine from the perspective of defining the potential, but challenging for us because it is hard to know that the dihedral was found, but had a force constant of 0.0. This may be something that is easier to address once we move over to GMSO for applying forcefield parameters. @jpotoff I agree that this is a problem. It makes it difficult to be sure that all of the dihedral were indeed parameterized.

If others agree with this breakdown and commentary, I would advocate for closing this issue and opening new discrete issues for each of these so we can tackle them.

@rmatsum836
Copy link
Contributor

  1. Fix verbose=True so that turning it on never breaks the code. Even if we are unable to output the full verbose results, we should tell the user than rather than erroring out. This seems to be a bug that needs to be addressed.

I agree, and is something I can address in a PR. The fix won't be perfect, but we can at least prevent verbose=True from erroring out.

this may be something that is easier to address once we move over to GMSO for applying forcefield parameters

I think checking dihedrals will become easier once we replace Parmed with GMSO. A big headache right now is that RB dihedrals and periodic dihedrals are stored as separate attributes in a parmed structure. In GMSO we will keep track of all proper dihedrals in a single attribute which should make the book keeping easier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants