Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solvent padding description #827

Open
mattwthompson opened this issue Apr 15, 2024 · 3 comments
Open

Solvent padding description #827

mattwthompson opened this issue Apr 15, 2024 · 3 comments
Labels
documentation Improvements or additions to documentation

Comments

@mattwthompson
Copy link
Contributor

Sorry for the long explanation - the actual issue here is incredibly pointed and I want to make sure I'm highlighting only it.

OpenFE's docstring of the padding option in solvation settings is:

"""Minimum distance from any solute atoms to the solvent box edge."""

This setting appears to be passed 1:1 to OpenMM without modification

OpenMM's documentation of a similar argument is a little more verbose

You can give a padding distance. A bounding sphere containing the solute is determined, and the box size is set to (sphere diameter)+(padding). This guarantees no atom in the solute will come closer than the padding distance to any atom of another periodic copy. If the sphere diameter is less than the padding distance, the box size is set to 2*(padding) to ensure no atom is closer than the padding distance to two periodic copies of any other atom.

For some small molecules, however, this produces results as if the bounding sphere representing the solute is size zero, which doesn't intuitively hold up to me. OpenFF has a similar implementation (and documented argument) that goes through Packmol after determining the expected system size, and for similar systems it frequently produces boxes a few Angstroms larger. I'm running small tests with a benzene-in-water system, and OpenFE's code path produces box vectors of 2.4 nm (simply twice the padding value of 1.2) whereas the OpenFF code path produces numbers closer to 2.8 nm (twice the padding value plus a little bit of fuzziness representing the size of the molecule). Maybe for a single atom it makes sense, but for a 3D molecule with some size to it, I don't think the box vectors should be twice the padding value in all three dimensions. Unfortunately this seems to be what OpenMM does (see below).

Both OpenFF and OpenFE code paths produce sensible results (no crashes) as far as I've observed, so I don't think it makes sense to ask for a behavior change. The documentation doesn't communicate that there should be different behavior, so I think that could be updated in some way. I have no reason to believe results would be different, but accidentally packing into a larger-than-needed-box is an easy way to slow down simulations with no benefit ((2.8 / 2.4) ** 3 ~= 1.6, or 60% slower compute!).

Here's a rough demonstration of how the OpenMM code path produces different system sizes with molecules of different size

import openmm
import openmm.app
import openmmforcefields
import openmm.unit

from openff.toolkit import Molecule

from openmmforcefields.generators import GAFFTemplateGenerator


def smiles_to_box_vectors(smiles: str):
    molecule = Molecule.from_smiles(smiles)

    molecule.generate_conformers(n_conformers=1)

    gaff = GAFFTemplateGenerator(molecules=molecule)

    forcefield = openmm.app.ForceField("tip3p.xml")
    forcefield.registerTemplateGenerator(gaff.generator)

    modeller = openmm.app.Modeller(
        molecule.to_topology().to_openmm(),
        molecule.conformers[0].to_openmm(),
    )

    modeller.addSolvent(
        forcefield,
        padding=1.0 * openmm.unit.nanometers,
    )

    a, b, c = modeller.getTopology().getPeriodicBoxVectors()

    return {round(value._value, 3) for value in [a[0], b[1], c[2]]}


for smiles in [
    "N",
    "CCO",
    "c1ccccc1",
    "C12C3C4C1C5C2C3C45",
    "CCCC[N+](CCCC)(CCCC)CCCC",
    20 * "C",
]:
    print(smiles, smiles_to_box_vectors(smiles))

# N {2.0}
# CCO {2.0}
# c1ccccc1 {2.0}
# C12C3C4C1C5C2C3C45 {2.0}
# CCCC[N+](CCCC)(CCCC)CCCC {2.226}
# CCCCCCCCCCCCCCCCCCCC {2.767}

If it would be useful I can

  • Run a MWE of a system setup using OpenFE's API (I have this elsewhere, not stripped down, and the behavior is consistent with
  • Show what I referred to earlier with OpenFF's Packmol code and why the results are a little different
  • Push this up to OpenMM, though updating the documentation there might take longer with slower release cycles
@mikemhenry mikemhenry added the documentation Improvements or additions to documentation label Apr 17, 2024
@mikemhenry
Copy link
Contributor

Thanks for this!

If I understand this correctly, the first thing you are asking is for us to update the documentation. Do you have a suggested doc string? At first I was going to summarize the openmm doc and then include a link to it, but it sounds like even the openmm doc isn't great, so do you have a suggestion? Or, what would you like to see mentioned?

@mattwthompson
Copy link
Contributor Author

Yeah sorry this is all over the place; it'd be much easier if it was an OpenFE or OpenMM documentation issue and not a mixture of both

The best suggestion I've thought of would be a dressed-up version of

"""
Minimum distance from any solute atoms to the solvent box edge.

This argument is passed directly to OpenMM's addSolvent (link), which
typically produces boxes of size twice the padding distance plus the approximate
solute size, but for some small solutes may treat the solute as having zero size.
"""

This would communicate the quirk I'm observing in a way that squares the responsibility with OpenMM but doesn't assign blame. I think paraphrasing the OpenMM docs risks this docstrings falling out of sync each other, whereas just a link wouldn't be as useful to the OpenFE user. (If the behavior changes in OpenMM, this is a different issue altogether). That covers everything I have been thinking of, the most important thing being that it describes what happens

@IAlibay
Copy link
Contributor

IAlibay commented Apr 21, 2024

Thanks @mattwthompson. We could/should add it to this PR: #673, unfortunately it didn't make it into 1.0, but the ability to use the wider set of inputs from addSolvent alongside better docstring would be great.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

3 participants