Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve pdb_merge to include TERs and END lines #150

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

joaomcteixeira
Copy link
Member

Closes #149

Copy link
Member

@amjjbonvin amjjbonvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But are END lines first filtered out? There should not be multiple END statements

@joaomcteixeira
Copy link
Member Author

That's right. However, I feel we are adding too much to the purpose of merge. All that should be done by tidy. Maybe if we exchange the -strict option for tidy we don't need to touch the merge. What do you think? Because the merge was really designed to be a concatenator

@amjjbonvin
Copy link
Member

amjjbonvin commented Nov 8, 2022 via email

@joaomcteixeira
Copy link
Member Author

The current version of pdb_merge does not remove any lines from the input files. It's a straight concatenation.

for fhandle in flist:
for line in fhandle:
yield line
fhandle.close()

@amjjbonvin
Copy link
Member

amjjbonvin commented Nov 8, 2022 via email

@joaomcteixeira
Copy link
Member Author

Yes. mkensemble has the clear purpose of making a correct ensemble of structures. While merge assumes the user knows what he/she is doing when concatenating the files. Likely the user cleaned the PDBs before using merge.

@amjjbonvin
Copy link
Member

amjjbonvin commented Nov 8, 2022 via email

@joaomcteixeira
Copy link
Member Author

Good point. Let's first clarify that to users before changing the behavior of pdb_merge.

@joaomcteixeira
Copy link
Member Author

I was reviewing this now, and I still agree we should not touch pdb_merge for this purpose and pipe the results to pdb_tidy as the logic is not that straightforward. It is stated in the docs.

The contents are not sorted and no lines are deleted (e.g. END, TER
statements) so we recommend piping the results through `pdb_tidy.py`.

@amjjbonvin
Copy link
Member

How will you explain users that pdb_merge does require pdb_tidy to make a correct PDB while pdb_mkensemble not...

@joaomcteixeira
Copy link
Member Author

You are right. I am addressing this. It's not as straightforward as adding an END because also TER was not accounted for. If TERs exist in the output is because they were already present in the input. I am working on it.

@joaomcteixeira joaomcteixeira marked this pull request as draft April 3, 2023 22:13
@joaomcteixeira joaomcteixeira changed the title add END line as final line in pdb_merge Improve pdb_merge to include TERs and END lines Apr 3, 2023
@joaomcteixeira
Copy link
Member Author

When doing pdb_merge, should atom numbers be renumbered starting from 1? This would change the original atom numbers but avoid repeated numbers.

@amjjbonvin
Copy link
Member

amjjbonvin commented Apr 6, 2023 via email

@joaomcteixeira joaomcteixeira marked this pull request as ready for review April 11, 2023 14:14
@joaomcteixeira
Copy link
Member Author

Hi @amjjbonvin

I have addressed this issue, plus some other details. Now, pdb_merge operates as follows:

  • The merged PDB file will represent a single MODEL.
  • Comment lines in input PDBs will be ignored (REMARK, ...)
  • Atom numbers are restarted from 1.
  • CONECT lines are yield at the end. CONECT numbers are updated to
    the new atom numbers.
  • Missing TER and END statements are placed accordingly. Original
    TER and END statements are maintained.

Inside tests/data/ there are three PDBs, dummy_merge_A/B/C.pdb that you can use to test. Test also with others you may know, and let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pdb_merge not adding END statement
2 participants