Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable EMST in all colvars using makeWhole() #1068

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

GiovanniBussi
Copy link
Member

Description

This is changing how PBCs are dealt with in a lot of CVs, so please @maxbonomi @carlocamilloni @gtribello @Iximiel have a look if it makes sense

Current state

Currently, most CVs handle PBCs automatically to some extent. In many cases, the CV definition is not depending on periodic boundaries per se, but just requires molecules to be properly reconstructed before the CV is computed. This is achieved by calling the function makeWhole. This function makes a local copy of the coordinates whole by computing the distance between consecutive atoms in the list using PBCs.

WHOLEMOLECULES is someway special, because in addition to this can use a minimum spanning tree computed on the reference structure provided by MOLINFO (see #681). In practice:

MOLINFO STRUCTURE=ref.pdb WHOLE
WHOLEMOLECULES ENTITY0=1-123 EMST

will reconstruct PBCs using the spanning tree. Here we specify the information twice:

  1. We note that the ref.pdb is WHOLE.
  2. We tell to WHOLEMOLECULE to use those coordinates as a reference

Proposed change

I here implemented to possibility to use the spanning tree also in local reconstructions (e.g., in RMSD, GYRATION, etc). The logic is the following: for a given CV, if the last appearing MOLINFO in the input file is marked as WHOLE we use the spanning tree. Otherwise we use the ordered reconstruction. With NOPBC, PBCs are always ignored.

For example:

# no molfile, so we use the standard (ordered) reconstruction
r1: RMSD REFERENCE=ref.pdb
# PBC are not used:
r2: RMSD REFERENCE=ref.pdb NOPBC

MOLINFO STRUCTURE=ref.pdb WHOLE
# use the minimum spanning tree
r3: RMSD REFERENCE=ref.pdb
# PBC are not used, same as r2
r4: RMSD REFERENCE=ref.pdb NOPBC

MOLINFO STRUCTURE=ref.pdb # notice: no WHOLE here
# latest molfile is not whole, so we use the standard (ordered) reconstruction
# same as r1
r5: RMSD REFERENCE=ref.pdb
# PBC are not used, same as r2 and r4
r6: RMSD REFERENCE=ref.pdb NOPBC

For consistency, I did the same to WHOLEMOLECULES. This means that:

  • EMST flag in WHOLEMOLECULES is not needed anymore: if the previous MOLINFO is WHOLE, the tree will be used anyway.
  • EMST flag is allowed (for backward compatibility) and triggers an error if it's used when the previous MOLINFO is not WHOLE, as a sanity check.

Closed issue

This PR addresses points 1 and 4 in #681. Given that point 2 is closed and point 3 was optional (and likely not a good idea, because it would be extremely expensive), I think this closes #681 .

Implementation

This required me to rewrite the Tree class so as to manage a representation of the tree where we store indexes rather than AtomNumbers. In pratice, I construct the indexes list, then I use it to construct the AtomNumbers list. This is only done during initialization.

Notice that a separate tree is build for each CV depending on the used atoms. So, in order to compute the end-to-end distance in a polymer it is necessary to use WHOLEMOLECULES!

Target release

I would like my code to appear in release v2.10

@GiovanniBussi GiovanniBussi changed the title Enable EMST in all colvas using makeWhole() Enable EMST in all colvars using makeWhole() May 3, 2024
@carlocamilloni
Copy link
Member

i think is a very useful feature

@maxbonomi
Copy link
Member

I agree!

Comment on lines +441 to +442
const auto & tree_indexes=tree->getTreeIndexes();
const auto & root_indexes=tree->getRootIndexes();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may have read the Tree class wrong:
It appears to me that Tree::getTreeIndexes(); and Tree::getRootIndexes(); are getters that return a copy to the respective internal variable in the tree.

It seems to me that that const auto& creates a reference to the copy that is returned by the getter
I think this example shows better what I am saying.

It may be a good idea to the getters return a const reference, so you will have a read-only accessor without any copy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I've already seen this (the original version was returning by copy as well) and I agree it should be fixed. Thanks!

@GiovanniBussi
Copy link
Member Author

Regarding this:

Given that point 2 is closed and point 3 was optional (and likely not a good idea, because it would be extremely expensive)

maybe it is worth to think about the on-the-fly version. This would require to rewrite the EMST in a more general version that:

  • is optimized when we can assume that all atoms are present (indexes would be just consecutive numbers)
  • can use custom functions to compute distance (e.g. with PBC).

Syntax could be something like:

# no need to have a whole reference structure: the structure is made whole on the fly!

# do it every time the action is enabled, here it's every 1000 steps
DUMPATOMS ATOMS=1-1000 DYNAMIC_EMST=YES STRIDE=1000

# do it every fixed pace (here 1000), assuming it's constant in the intermediate steps
# very much like a neighbor list
DUMPATOMS ATOMS=1-1000 DYNAMIC_EMST=1000 STRIDE=10

And DYNAMIC_EMST would be a kind of ubiquitous keyword, similar to NOPBC. We might choose a better name for it

I think this can be super useful for visualization, so I want to give it a try. In the end, this is order N^2 calculation, which is the same that happens for Coordination numbers when we do the neighbor list update, so it is not completely crazy to use it "as a neighbor list".

@GiovanniBussi GiovanniBussi added the wip Do not merge label May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wip Do not merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Euclidean minimum span tree for automatic wholemolecules
5 participants