Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improper atom order incorrect in LAMMPS data writer for harmonic impropers? #899

Open
rsdefever opened this issue May 5, 2021 · 22 comments
Assignees
Labels
bug discussion high prioirty things that should be resolved asap

Comments

@rsdefever
Copy link
Member

Bug summary

It appears that the atom ordering for harmonic impropers is incorrect in the LAMMPS data writer. Here is the code that writes the impropers. Note that the atoms are written so that the atom1.idx is written in the third position.

            # Dihedral data
            if impropers:
                data.write("\nImpropers\n\n")
                for i, improper in enumerate(impropers):
                    data.write(
                        "{:d}\t{:d}\t{:d}\t{:d}\t{:d}\t{:d}\n".format(
                            i + 1,
                            improper_types[i],
                            improper[2],
                            improper[1],
                            improper[0],
                            improper[3],
                        )
                    )

However, LAMMPS specifies that the central atom should be first see docs here, and parmed stores the central atom first see docs here.

Am I missing something, or are we reordering these atoms when we should not be.

@ahy3nz
Copy link
Contributor

ahy3nz commented May 5, 2021

I can't really remember how I came to that conclusion in an old PR

@rsdefever
Copy link
Member Author

@ahy3nz thanks for hopping in here. I guess we need to make some simple test cases to throw through parmed. The parmed docs are either wrong, or something has changed.

@rsdefever
Copy link
Member Author

Unless someone beats me to the punch, I'm going to create a test case with formic acid. Its a small and simple molecule. I'll make up a fake improper if I have to and I'll create two XMLs; one with a harmonic improper and one with an amber-style improper. Then we can look at the parmed.Structure and also write to LAMMPS, GROMACS, Cassandra, (and others?) and see what happens. 😬 😬 😬

@mattwthompson
Copy link
Member

Impropers are a complete and utter mess. Everybody does it a little differently, everybody adds their own quirks, nobody can agree on the right ordering. I can go on but we're all familiar with the idea.

I don't trust anything short of running an MM single-point energy and isolating just the improper contribution to the overall potential energy. Absolutely tedious but I don't think there's any shortcuts to be had here. LAMMPS and GROMACS make it feasible enough to decompose potential energy contributions (with some file parsing), and I'm assuming Cassandra does as well. CHARMM might, Amber probably does, but OpenMM doesn't.

In fact, OenMM's grouping of propers and impropers has probably led to some confusion in Foyer - are there impropers here? I don't know. ParmEd will probably find some, and plausibly be correct about them, but it's hard to verify them without understanding the source of the data (which I do not).

@rsdefever
Copy link
Member Author

rsdefever commented May 5, 2021

Impropers are a complete and utter mess.

Agreed, but we're stuck with them 👎 .

Here is a gist with an XML file that has an amber-style improper and a python script for formic acid. Please don't take it too seriously. Its a bit of a frankenstein file, but will work for this purpose. Am I being dumb or can we even specify harmonic impropers with foyer?

All I've looked at so far is writing parmed -> gmx. The central atom (c) appears to be written 3rd. Can anyone find a reliable source to say whether that is correct or not for gromacs? I didn't see it mentioned in the reference manual, although I feel like it must be there. Here and here is a start. This thread from the mailing list seems to say central atom should be first?

Update: For formic acid and with amber-style impropers parmed appears to be storing the central atom third in the improper. This is consistent with the parmed docs for amber-style impropers here.

import foyer
import mbuild
import parmed

mol = mbuild.load("C(=O)O", smiles=True)
ff = foyer.Forcefield("improper_amber.xml")

mol_ff = ff.apply(mol)

mol_ff.save("test.gro", overwrite=True)
mol_ff.save("test.top", overwrite=True)

impropers = [d for d in mol_ff.dihedrals if d.improper]
print(impropers[0].atom3.idx) # this atom is index 0, the central atom.

# Just to double check that loaded parmed from file gives the same
# result as using foyer -> parmed
mol_loaded = parmed.load_file("test.top", xyz="test.gro")

impropers = [d for d in mol_loaded.dihedrals if d.improper]
print(impropers[0].atom3.idx) # this atom is index 0, the central atom.

So for amber-style impropers, parmed definitely storing central-atom-third. This is consistent with the docs.

The open question is then for harmonic impropers. The parmed docs say central atom first. I'll update when I learn more.

@rsdefever

This comment has been minimized.

@rsdefever
Copy link
Member Author

rsdefever commented May 6, 2021

Just to add more options for thinking about this...
here is a zip with a minimal working example in LAMMPS. LAMMPS very clearly specifies central atom first. The energy it gives is: 0.0678461708 kJ/mol when the central atom (c, 1-indexed atom 1) is placed first. This more or less agrees with the "flipped" energies from GROMACS.

@rsdefever
Copy link
Member Author

rsdefever commented May 6, 2021

First, apologies if anyone is confused by this thread. I have removed a few earlier comments because I felt that they were unnecessarily complicating the matter.

Harmonic impropers: I am going to compare manually computed energies with LAMMPS and GROMACS.

parmed preserves the harmonic improper order when reading/writing gmx files. For example, if the topology has an improper with the atoms listed in order 1 4 3 2, then the atom1.idx, atom2.idx, etc, in parmed have the same order. You can easily confirm this with the python script attached as part of gromacs_debug.tar.gz.

Next, if we trust the parmed docs, we know the central atom should be first. Then, we can calculate the energy of the improper. We can also flip the first and third atom and re-compute the energy.

Next, we can compute the energies of the original and flipped topologies with gmx. We can compare the energies between our values and theirs.

Finally, we can compute the improper energy in LAMMPS, which clearly defines central atom first.

Summary:
LAMMPS, GROMACS, and the manual calculation show very similar energies when central atom is first. This suggests that the parmed documentation is correct. It also suggests that our LAMMPS writer is incorrect. Please let me know if there is an error in my logic.

gromacs_files.tar.gz
lammps_files.tar.gz

Here are the results from the manual calculation:

(1-indexed) first atom in parmed improper is: 1
Harmonic improper energy is 0.06784616933084593 kJ/mol
(1-indexed) first atom in parmed improper is: 3
Harmonic improper energy is 20.18619305350575 kJ/mol

From GROMACS with this topology:

[ dihedrals ]
;    ai     aj     ak     al funct         c0         c1         c2         c3
      1      4      3      2     2    10.0000000     4.6024000

the energy is: Improper Dih. 0.0723808 -- 0 0 (kJ/mol)

For the flipped topology:

[ dihedrals ]
;    ai     aj     ak     al funct         c0         c1         c2         c3
      3      4      1      2     2    10.0000000     4.6024000

the energy is: Improper Dih. 20.1864 -- 0 0 (kJ/mol).

The energy from lammps is: 0.0678461 for the following impropers:

Impropers

1       1       1       4       3       2

@rsdefever
Copy link
Member Author

@ahy3nz I agree with most of the observations in your comment. However, I disagree with:

Parmed Structure: A parametrized Structure will list the central atom as the third atom for ALL impropers.

As far as I can tell, the CHARMM-style (harmonic) impropers are stored central-atom-first. I make this statement based upon reading/writing GROMACS files, the documentation, and a comparison of the computed energies between by-hand, LAMMPS, and GROMACS.

@rmatsum836
Copy link
Contributor

rmatsum836 commented May 6, 2021

Similar test case as above but for Amber improper dihedrals. I don't get agreement in improper energies between gromacs and lammps unless I flip the 1st and 3rd atom of the improper in gromacs (flipped.top).

emim_improper.tar.gz

@rmatsum836
Copy link
Contributor

I updated the above example. There is now a test_periodic.py script that builds a system from scratch with mosdef and uses the resulting parmed structure to calculate the improper energies. Then a comparison is made with the top files.

@ahy3nz
Copy link
Contributor

ahy3nz commented May 6, 2021

Sounds like the lammpswriter probably needs to update the ordering then. In hindsight, I'm really curious what I had seen 3 years ago that made me incorrectly think the parmed convention was central-atom-third.

@lilyminium
Copy link

@ahy3nz in 2019 I came to the conclusion that parmed had the central atom first for harmonic impropers but third for periodic impropers. I didn't link to any part of the code or docs though :/

@rsdefever
Copy link
Member Author

rsdefever commented May 6, 2021

@ahy3nz I'm pretty convinced for the harmonic impropers.

I'm still struggling to figure out the amber-style impropers. GROMACS energies match with by-hand calculations (shout out to @rmatsum836) if you use central-atom-first. But the question still remains, does amber specify central-atom-third. And I lean towards yes on that. The main reason is this: for amber-style impropers, you should be able to use the same functional form as proper dihedrals, but just drop in the central atom as the third atom. If you do that in gmx and compute the energies with funct 1 (periodic) vs. funct 4 (periodic improper) the energies are the same with the same atom ordering.

@rmatsum836
Copy link
Contributor

@lilyminium is correct. ParmEd correctly maintains the order of Amber impropers (central-atom third) so we should be able to keep the atom order from ParmEd dihedrals when writing out to LAMMPS.

@ahy3nz
Copy link
Contributor

ahy3nz commented May 6, 2021

I think this foyer test might be fishy then. The central atom is well-defined but the resultant parmed improper puts that central atom as the 3rd element in the pmd.improper object?

@rsdefever
Copy link
Member Author

I think this foyer test might be fishy then. The central atom is well-defined but the resultant parmed improper puts that central atom as the 3rd element in the pmd.improper object?

Thanks for bringing this to my attention. This is why I opened up the thread rather than just dropping in a fix immediately. I'll let you know what I find.

@rsdefever
Copy link
Member Author

rsdefever commented May 7, 2021

@ahy3nz I took a look at the test. I agree--central atom definitely ends up third in the parmed.Structure. Is something going wrong in the openmm -> parmed conversion then?

What is interesting is that the central atom (_CL) is listed first in the XML, which I think correctly follows the OpenMM spec for impropers. But then something gets switched around by the time it gets to parmed. The conversion doesn't seem to do any re-ordering.

@ahy3nz
Copy link
Contributor

ahy3nz commented May 7, 2021

It appears the atom ordering switching happens when foyer creates the openmm system, BEFORE any conversion-to-parmed happens. Specifically, when creating the customtorsionforce object via the customtorsionforcegenerator, there is some interesting improper-matching logic implemented in openmm.

Consequently, the order of the atoms in the improper changes depending on the presence of wildcards.

At the very least, I think this atom-ordering-improper issue stems in how openmm creates the customtorsionforces, and our writers try to be consistent with those, but this improper-with-wildcards is an interesting situation. Maybe the a test similar to test_charmm_improper should be done with no wildcards in the custom torsion xml

@rsdefever
Copy link
Member Author

Specifically, when creating the customtorsionforce object via the customtorsionforcegenerator, there is some interesting improper-matching logic implemented in openmm.

I may be misinterpreting the code, but it looks like for charmm the ordering is central-atom-third if there are wildcards but central atom first if not? That seems very odd, but I may be misinterpreting the code. amber is always central-atom-third, as you would expect.

@ahy3nz
Copy link
Contributor

ahy3nz commented May 10, 2021

That's what I had gathered from the code, central-atom changes if there are wildcards. I don't know why, and I don't know how that logic was determined.

If this openmm logic is the "right" way to treat these customtorsionforce elements in an XML, then this might mean one of two things?

  1. The charmm_cooh.xml I used is actually a malformed XML element with a wildcard, and a correctly-written XML element will be processed by openmm to yield an improper that always puts central atom first?

  2. If the charm_cooh.xml is correctly written, then the resultant writers will need to adjust the atom ordering of the parmed improper objects based on XML element (wildcard or not)?

@CalCraven
Copy link
Contributor

We need to resolve this issue (only a year later). This is a pretty important one to anyone using these systems. I'm marking this as high priority, and bringing it up at the next developers meeting.

@CalCraven CalCraven added the high prioirty things that should be resolved asap label Aug 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug discussion high prioirty things that should be resolved asap
Projects
None yet
Development

No branches or pull requests

8 participants