Adding hydrogens with -p messes up the entire structure (not only residue numbers and names) #2677
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! |
@kalinni On Linux Debian 13 (trixie) with Open Babel 3.1.1, I added the hydrogens with
$ obabel 3lcs.pdb -h -O obabel_with_hydrogens.pdb
==============================
*** Open Babel Warning in PerceiveBondOrders
Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders (title is 3lcs.pdb)
1 molecule converted
Based on this warning, perhaps some ligands or chains are distorted (in bond distances, geometry, or both) too much for Open Babel's patterns to recognize them well. I then ran grep, but found no matches:
$ grep UNL ./obabel_with_hydrogens.pdb -c
0
$ grep UNK ./obabel_with_hydrogens.pdb -c
0
An alternative approach could be to use Jmol: load the initial .pdb and (File -> Console) enter the following commands there (each confirmed with Enter):
for a new .pdb file. This file equally yields no matches:
$ grep UNK Jmol_with_hydrogens.pdb -c
0
$ grep UNL Jmol_with_hydrogens.pdb -c
0
Either way, a different processing program might provide a .pdb different enough to continue the work. (In contrast to the input, the lattice parameters are missing, though.) |
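A note on the counts above: grep matches UNK or UNL anywhere in a line, including REMARK records. Below is a minimal sketch in Python (chosen because PLIP is a Python tool) that reads only the residue-name field instead. It assumes the fixed-column layout of ATOM/HETATM records, with the residue name in columns 18-20. The function name is hypothetical and not part of Open Babel or PLIP.

```python
from collections import Counter


def count_atoms_by_residue_name(pdb_lines):
    """Count ATOM/HETATM records per residue name (PDB columns 18-20)."""
    counts = Counter()
    for line in pdb_lines:
        # Only coordinate records carry a residue name at this position.
        if line.startswith(("ATOM  ", "HETATM")):
            counts[line[17:20].strip()] += 1
    return counts


# Usage (the file name is the one used in the commands of this thread):
# with open("3lcs_prot.pdb") as handle:
#     counts = count_atoms_by_residue_name(handle)
# print(counts["UNK"], counts["UNL"])
```

Unlike `grep -c`, this reports atom records per residue name, so a stray "UNK" in a header or remark line does not inflate the number.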
Thank you so much for the quick response! The issue occurs with -p, though; with -h the file is fine, but the hydrogens are not what I expect. |
To follow up a little - from what I understand, in both cases (using While with As mentioned, my actual use case is within a Python tool (PLIP), so I am looking for a solution that does not require additional programs, as I would like to avoid adding further dependencies. |
I noticed I am getting the following warning when protonating with the
Maybe that could be related? Not sure what's wrong there, the file is present in the same directory as the phmodel.txt and definitely not missing. |
Without working knowledge of C++, I lack insight into how Open Babel's C++ code is organized, how its parts interact, and how it eventually provides results. The log you shared might indicate that something is broken in the Open Babel setup accessible to you, because issuing the same command as you did yields in my instance (Open Babel 3.1.1, as provided by Debian) only the following:
$ obabel 3lcs.pdb -O 3lcs_prot.pdb -p
==============================
*** Open Babel Warning in PerceiveBondOrders
Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders (title is 3lcs.pdb)
1 molecule converted
Speculation: the PLIP tool's file
|
Hi @nbehrnd, thanks so much for trying to figure this out with me. My C++ knowledge is rather limited, too. 😅 I already adapted the requirements file (on my machine PLIP is using OpenBabel 3.1.1 and there is only the one OpenBabel installation), fought my battle with the OpenBabel Python bindings 3.1.1.1 issue, and have PLIP and OpenBabel running in a virtual environment. When you run the command with Thanks and have a lovely weekend! |
@kalinni With
$ obabel 3lcs.pdb -O 3lcs_prot.pdb -p
==============================
*** Open Babel Warning in PerceiveBondOrders
Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders (title is 3lcs.pdb)
1 molecule converted
$ grep UNK 3lcs_prot.pdb -c
457
$ grep UNL 3lcs_prot.pdb -c
1500
So I resorted to AWK to extract them separately (attached below), i.e.
$ awk '{if ($0 ~ /UNK/) print}' 3lcs_prot.pdb > UNK.pdb
$ awk '{if ($0 ~ /UNL/) print}' 3lcs_prot.pdb > UNL.pdb
but their individual display, e.g. in Jmol, does not look pretty. For a protein structure I anticipate a couple of unbound / unlinked molecules of water and perhaps one or a few small ligands, but there are a bit too many atoms and too large ensembles to feel comfortable here. In addition (UNK_detail), a couple of motifs in the structure with added hydrogens merit a check. E.g., the cyclopropane: not that it is impossible, but because of its ring strain it is less likely to be seen in a naturally occurring compound. The same frame equally features a carbon exceeding tetravalence and hydrogens seemingly sharing the same position, which is not good, chemically speaking. The small molecules the report mentions as independent from the protein are not clearly visible (yet). So I thought one could split the original file (not yet protonated) into molecules Open Babel recognizes as separate:
$ obabel 3lcs.pdb --separate -O fragment.pdb -m
==============================
*** Open Babel Warning in PerceiveBondOrders
Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders (title is 3lcs.pdb)
142 molecules converted
142 files output. The first is fragment1.pdb
There are some which are large:
$ wc -l fragment*.pdb | sort -k 1 -rn | head
11135 total
5517 fragment.pdb
1422 fragment1.pdb
894 fragment19.pdb
832 fragment9.pdb
502 fragment17.pdb
468 fragment7.pdb
274 fragment5.pdb
224 fragment11.pdb
200 fragment13.pdb
Here, fragment13.pdb includes a cyclopropane. Yet quite a number of fragments are small (water), too:
$ wc -l fragment*.pdb | sort -k 1 -n | grep "5 frag" -c
122
$ wc -l fragment*.pdb | sort -k 1 -n | head
5 fragment100.pdb
5 fragment101.pdb
5 fragment102.pdb
5 fragment103.pdb
5 fragment104.pdb
5 fragment105.pdb
5 fragment106.pdb
5 fragment107.pdb
5 fragment108.pdb
5 fragment109.pdb
There should be a better way to address the hydrogenation. |
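The observation above about hydrogens seemingly sharing the same position could be checked without a viewer. The following is a minimal sketch in Python, assuming standard fixed-column ATOM/HETATM records (coordinates in columns 31-54). The function names and the 0.5 angstrom cutoff are illustrative assumptions, not values taken from Open Babel.

```python
import math
from itertools import combinations


def read_atoms(pdb_lines):
    """Return (serial, atom name, residue name, (x, y, z)) per ATOM/HETATM record."""
    atoms = []
    for line in pdb_lines:
        if line.startswith(("ATOM  ", "HETATM")):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((line[6:11].strip(), line[12:16].strip(),
                          line[17:20].strip(), xyz))
    return atoms


def close_contacts(pdb_lines, cutoff=0.5):
    """List atom pairs closer than `cutoff` angstrom (0.5 is an assumed threshold)."""
    pairs = []
    # Plain all-pairs loop: quadratic in the atom count, so it is best
    # run on one fragment at a time rather than on a very large file.
    for first, second in combinations(read_atoms(pdb_lines), 2):
        distance = math.dist(first[3], second[3])
        if distance < cutoff:
            pairs.append((first[:3], second[:3], round(distance, 3)))
    return pairs
```

Running it on a single fragment file (for example the one containing the cyclopropane) keeps the pair count small. An empty list means no two atoms sit closer than the cutoff.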
@kalinni Separate thought: perhaps your workflow does not require the addition
of the complete set of hydrogens either; in a separate conversation I learned
the (ligand) molecules submitted to /in silico/ docking carry hydrogens on
hydrogen donors like -OH, -NH2, but lack hydrogens on the alkyl skeleton.
(But I'm not a user of vina, either.)
|
Hi @nbehrnd, thanks for taking such a deep dive into this. Actually I am working specifically on PLIP and 3lcs.pdb is only an example file I am currently using for testing. I am looking into some issues regarding protonation - PLIP currently calls However, I am getting a hydrogen atom added to one of the oxygens in glutamic acid's side chain, which I don't think should be there in physiological conditions. I was experimenting with using Regarding that cyclopropane - it is a methionine in the original pdb file, so I guess Open Babel messes it up even more than I first noticed. |
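The glutamate observation can be checked mechanically on a file whose residue names survived (such as the -h output). This is a minimal sketch in Python, not PLIP's or Open Babel's own logic: it flags GLU side-chain oxygens (OE1/OE2) that have any hydrogen within an assumed 1.2 angstrom, a little above a typical O-H bond length of about 0.96 angstrom. The function names and the cutoff are hypothetical.

```python
import math


def _is_hydrogen(line):
    """Use the element column (77-78) if filled, else fall back to the atom name."""
    element = line[76:78].strip().upper()
    if element:
        return element == "H"
    return line[12:16].strip().lstrip("0123456789").upper().startswith("H")


def protonated_glu_oxygens(pdb_lines, max_oh=1.2):
    """Return (chain, residue number, atom name) for each GLU OE1/OE2
    that has a hydrogen within `max_oh` angstrom (assumed cutoff)."""
    oxygens, hydrogens = [], []
    for line in pdb_lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        name = line[12:16].strip()
        if _is_hydrogen(line):
            hydrogens.append(xyz)
        elif line[17:20] == "GLU" and name in ("OE1", "OE2"):
            oxygens.append(((line[21], int(line[22:26]), name), xyz))
    return [label for label, xyz in oxygens
            if any(math.dist(xyz, h) <= max_oh for h in hydrogens)]
```

The search covers all hydrogens in the file rather than only those labelled GLU, so it does not depend on how the added hydrogens were named or assigned to residues.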
I think the problem is that the function for correcting for pH reperceives the chains and residues. I made a PR that will change this; if you are comfortable with building from source, you could try that one out and see if it helps. |
Awesome! Sounds like this could easily explain the results I am getting. So far I have not been building OpenBabel from source, but I'll definitely give it a go in the upcoming days and let you know here how it went. Thanks! |
@kalinni The page https://open-babel.readthedocs.io/en/latest/Installation/install.html compiles (...) some hints. I agree it is worth pulling up a chair to digest the steps required. |
Hi all, |
@kalinni A postscript: in my pane of «suggested projects», I just encountered a link to reduce by the Richardson group. It might be an interesting checkpoint for your reduction/addition of hydrogen atoms to .pdb files, both for the options the program provides and for what it reports back to the CLI (including warnings of possible bumps). Running this once without any of the optional flags (cf. archive below) in the general pattern of
$ reduce input.pdb > output.pdb
in an instance of Linux Debian to yield a There is one instance of two carbon atoms pretty close / partially overlapping each other (cf. |
First of all thanks for working on and maintaining OpenBabel!
I want to add hydrogens to my pdb file. If I use
obabel 3lcs.pdb -O 3lcs_prot.pdb -h
everything looks great in general but the results don't match my expectations - for example one of the oxygens in the side chain of glutamic acid (e.g. GLU 1197) gets a hydrogen. In physiological conditions (from my understanding) it doesn't have a hydrogen there and is charged.
So I moved on to use -p instead of -h to get hydrogens at pH 7.4. Now from what I can see the hydrogens are added following my expectation, but the pdb file is messed up.
(I am using OpenBabel within a Python project, but as the behavior is the same from the command line I used the command line options for easier reproducibility.)
Am I missing a preprocessing step or misunderstanding how these options should be used?
Thanks for your time and help!
Environment Information
Open Babel version: 3.1.1
Operating system and version: Windows 10
Expected Behavior
running
obabel 3lcs.pdb -O 3lcs_prot.pdb -p
should give me my pdb file with hydrogens added but residue numbers and names unchanged.
Actual Behavior
running
obabel 3lcs.pdb -O 3lcs_prot.pdb -p
changes residue numbers to start at 1 and changes the residue names for everything it doesn't recognize as a standard amino acid or has some other issues with to UNK or UNL.
(From what I see, alternative locations and missing atoms seem to cause issues, and any ligands in the file also lose their name.)
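To quantify the renaming and renumbering described here, one could compare the input and output files atom by atom. The sketch below, in Python, matches heavy atoms by their coordinate text, on the assumption that adding hydrogens leaves the existing coordinates written back unchanged. It then reports atoms whose residue name, chain, or residue number differs. The function names are hypothetical and not part of Open Babel or PLIP.

```python
def residue_ids_by_coordinates(pdb_lines):
    """Map each ATOM/HETATM coordinate triple (as text) to
    (residue name, chain, residue number)."""
    ids = {}
    for line in pdb_lines:
        if line.startswith(("ATOM  ", "HETATM")):
            coords = (line[30:38].strip(), line[38:46].strip(), line[46:54].strip())
            ids[coords] = (line[17:20].strip(), line[21:22].strip(),
                           line[22:26].strip())
    return ids


def changed_residue_ids(before_lines, after_lines):
    """Return {coords: (id before, id after)} for atoms whose residue id changed."""
    before = residue_ids_by_coordinates(before_lines)
    after = residue_ids_by_coordinates(after_lines)
    # Atoms present only in `after` (the added hydrogens) are ignored.
    return {coords: (before[coords], after[coords])
            for coords in before
            if coords in after and before[coords] != after[coords]}
```

An empty result would correspond to the expected behavior (hydrogens added, residue numbers and names unchanged). Any entries list the original and the rewritten identifiers side by side.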
Steps to Reproduce
download the pdb file and run the commands (also happens with other pdb files though)