Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange RMSD values #105

Open
livaschar opened this issue Feb 8, 2024 · 5 comments
Open

Strange RMSD values #105

livaschar opened this issue Feb 8, 2024 · 5 comments
Assignees

Comments

@livaschar
Copy link

I calculate the RMSD value for two xyz files (opt.xyz and sp.xyz) with all possible methods, and the smallest value I take for a specific example is 2.1.
Then I run ''calculate_rmsd -p opt.xyz sp.xyz'' and take the modified sp coordinates (sp_modified.xyz). When calculating the rmsd between opt.xyz and sp_modified.xyz, I take a value of 0.4.

Why is that? Shouldn't the calculation produce the same results?

xyz.zip

@charnley
Copy link
Owner

charnley commented Feb 9, 2024

Hi @livaschar , this is unwanted but expected behavior. It is because of the way the reordering is working. Trying to match atoms with atoms can be very approximate. Especially when the atom environments are very similar.

For example, if you try

calculate_rmsd --reorder --reorder-method hungarian sp.xyz sp_modified.xyz # 1.98
calculate_rmsd --reorder --reorder-method distance sp.xyz sp_modified.xyz # zero
calculate_rmsd --reorder --reorder-method hungarian sp.xyz sp.xyz # zero

you will see when you try to reorder, it is based on distance between atom pairs, and might not give the correct match.

Unfortunately, the Hungarian algorithm requires that the structures are already aligned to be a good fit. Which in your case happens after your initial alignment, which is why you see a good fit here.

calculate_rmsd --reorder opt.xyz sp_modified.xyz # 0.47

There is a missing --reorder-method based on a atom descriptor model called --reorder-method qml which might solve your issue and I am working on getting that python package up again.

But in essence your problem is you are trying to match your atoms on a very symmetric molecule, which is super hard to do. You might want to checkout rdkit for a chem-informatics approach, as this package does not use bonds to align.

@charnley charnley self-assigned this Feb 9, 2024
@livaschar
Copy link
Author

Thanks for your quick reply, @charnley! The thing that I don't fully get is what the coordinates of the 'calculate_rmsd -p' command are. Which method is used to reorder the molecule? Hungarian, distance, etc., or is it something else and I got it wrong?

I will appreciate it if you notify us when qml is up and running. Thank you in advance.

@nbehrnd
Copy link
Contributor

nbehrnd commented Feb 12, 2024

@livaschar The default reorder algorithm is the Hungarian one, see for instance README.rst. Flag -p and the more verbose --print report the coordinates of the then already aligned structure B. I don't know if one can access the atomic coordinates after the resort yet prior the alignment because so far this wasn't my interest while using the script.

@livaschar
Copy link
Author

@nbehrnd so if the coordinates printed by '-p' flag are the ones coming from the Hungarian algorithm, shouldn't the two cases

calculate_rmsd --reorder --reorder-method hungarian opt.xyz sp.xyz
calculate_rmsd --reorder --reorder-method hungarian opt.xyz sp_modified.xyz

produce the same results, since the 'sp_modified.xyz' is just the final modified version of the 'sp.xyz'?

@nbehrnd
Copy link
Contributor

nbehrnd commented Feb 12, 2024

@livaschar If I compare the structures sp.xyz and sp_modified.xyz as such outside the script with Jmol at the level of .xyz files -- only the atomic positions -- the two structures can be superimposed nicely. For Jmol's tolerances and eventual metric, the RMSD drops from initial 4.68 to 0.0. This supports your argument the results of a comparison of opt.xyz vs sp.xyz and of a comparison opt.xyz vs sp_modified.xzy should yield very similar (if not identical, ideally) RMSD.

As mentioned by @charnley, the restrain to use atomic positions (only) to describe a structure and highly symmetrical structures to compare with each other show a limitation of an algorithm / the implementation of an algorithm. For me, calculate_rmsd.py remains a valuable tool (or: how often does one compare two structures in C2 symmetry, or higher?).

2024-02-12_xyz.zip

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants