Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert itp files to ff files #327

Open
wants to merge 86 commits into
base: master
Choose a base branch
from
Open

Convert itp files to ff files #327

wants to merge 86 commits into from

Conversation

fgrunewald
Copy link
Member

@fgrunewald fgrunewald commented Jun 16, 2023

The new utility allows users to extract ff-files automatically from itp files by defining fragments as SMILES. Currently the code only works for linear polymers. Let's look at some examples for how to use this Program.

Say we have a itp that describes any number of PEO oligomers then to extract the parameters for PEO we'd run:

polyply itp_to_ff -i <itp_file.ipt> -sm [CH2]O[CH2] -rn PEO -o new.ff -charge 0

Now of course with PEO there is the challenge of how we treat the termini. They can for example be terminated with an OH group. By default the algorithm will group all atoms that cannot be assigned to a fragment as given by the smile into the next connected fragment. Thus it creates a new bigger fragment. In this example, we would obtain the PEO fragment but also PEOter fragment that describes the termini. If they are asymmetric the code produces 2 new residues describing the termini.

Of course this might not be what we want, since we may want to build a modular ff format. In this case we can also specify the OH terminal explicitly as a fragment. For example, like so:

polyply itp_to_ff -i <itp_file.ipt> -sm [OH][CH2] [CH2]O[CH2] [OH][CH2] -rn OH PEO OH -o new.ff -charge 0

Note that we need to provide the terminal definitions on both ends, because the code works in blocks. Using this command we obtain definition for the termini as a separate block.

@ricalessandri

To Do

  • Remove print and draw statements
  • sort the atoms in a block definition according to their name
  • deal with chirality/tacticity
  • improve the charge equalisation; also round charges so they are at most 2 significant digits
  • test on charged molecules
  • make pysmiles optional dep. and raise Error if not installed
  • How many repeat units are needed???
  • add a warning that extract links only work with linear molecules atm
  • add help text to CLI

Known Issues

  • sometimes the first match picks a repeat unit that is not at the start of the chain. In that case, all other matches are likely messed up. The workaround is to provide the first smile with all the correct hydrogen atoms
  • if there are too few repeat units to extract all representative interactions between monomers the small fragment will work correctly but the larger ones will be incorrect
  • only works with all-atom at the moment

@ricalessandri ricalessandri added the enhancement New feature or request label Jun 24, 2023
@fgrunewald fgrunewald mentioned this pull request Aug 12, 2023
2 tasks
@fgrunewald
Copy link
Member Author

@ricalessandri that's it; except for the improvements on labeling edges the code is done and has all tests required. I ran it on my CHARMM database and will try some OPLS tomorrow. Please give it a try and see if there are any problems.

Copy link
Member

@pckroon pckroon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this can be super useful!
I forsee some issues with the automatic handling of the termini, but it is a hard problem. You make a Link to deal with those, rather than a Modification? What's the reasoning here?

I'll finish this review after the BigSmiles PR is merged, and this branch has been updated.

parser_itp_ff.add_argument('-o', dest="outpath", type=Path)
parser_itp_ff.add_argument('-c', dest="charges", type=float, nargs='*')
parser_itp_ff.add_argument('-c', dest="res_charges", nargs='+', type=lambda s: s.split(':'),)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a help with the format/syntax

Comment on lines +244 to +245
parser_itp_ff.add_argument('-f', dest='inpath', type=Path, required=False, default=[],
help='Input file (ITP|FF)', nargs='*')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.python.org/3/library/argparse.html#argparse.FileType

Maybe also in other places. Optional though

self.resid += 1

def label_fragments_from_graph(self, fragment_graphs):
def extract_unique_fragments(self, reference_graph):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the docstring still needs updating

Comment on lines +151 to +152
# finally we simply collect one graph per restype
# which are the most centrail (i.e. avoid ends)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the "why"

# which are the most centrail (i.e. avoid ends)
unique_fragments = {}
frag_centrality = {}
centrality = nx.betweenness_centrality(self.res_graph)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want the most central nodes, see also https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.distance_measures.center.html#center
You can experiment a little on what gives the best results.

for node in target_block.nodes:
target_attrs = target_block.nodes[node]
ref_attrs = ref_block.nodes[target_attrs['atomname']]
for attr in ['atype', 'mass']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why limit to these attrs?

Comment on lines +296 to +297
if target_atoms == ref_inter.atoms and\
target_inter.parameters != ref_inter.parameters:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if target_atoms == ref_inter.atoms and\
target_inter.parameters != ref_inter.parameters:
if target_atoms == ref_inter.atoms and target_inter.parameters != ref_inter.parameters:

mol_atoms_to_link_atoms, edges, resnames = _extract_edges_from_shortest_path(target_inter.atoms,
molecule,
min(resids))
#link_to_mol_atoms = {value:key for key, value in mol_atoms_to_link_atoms.items()}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#link_to_mol_atoms = {value:key for key, value in mol_atoms_to_link_atoms.items()}

Comment on lines +303 to +305
link_inter = Interaction(atoms=link_atoms,
parameters=target_inter.parameters,
meta={})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
link_inter = Interaction(atoms=link_atoms,
parameters=target_inter.parameters,
meta={})
link_inter = Interaction(atoms=link_atoms,
parameters=target_inter.parameters,
meta={})

molecule,
min(resids))
link_atoms = mol_to_link.values()
link = vermouth.molecule.Link()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
link = vermouth.molecule.Link()

Copy link
Member

@pckroon pckroon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be continued after #358

Comment on lines +289 to +292
# a little dangerous but mostly ok; if there are no changes to
# the atoms we can continue
if len(replace_dict) == 0:
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dangerous why?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants