Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't construct surface rmgpy.molecule.Molecule() #2603

Open
sevyharris opened this issue Feb 5, 2024 · 2 comments · May be fixed by #2607
Open

Can't construct surface rmgpy.molecule.Molecule() #2603

sevyharris opened this issue Feb 5, 2024 · 2 comments · May be fixed by #2607
Assignees

Comments

@sevyharris
Copy link
Contributor

Bug Description

I'm trying to create a rmgpy.molecule.Molecule() using smiles

rmgpy.molecule.Molecule(smiles='O=C=[X]')

But it fails from an RDKit error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_18284/1886713532.py in <module>
----> 1 rmgpy.molecule.Molecule(smiles='O=C=[X]')
~/rmg/RMG-Py/rmgpy/molecule/molecule.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.molecule.Molecule.__init__()
~/rmg/RMG-Py/rmgpy/molecule/molecule.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.molecule.Molecule.from_smiles()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator.from_smiles()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator.from_smiles()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator._read()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator._rdkit_translator()
ValueError: Could not interpret the identifier 'O=C=[X]'

Possible Solution

I'm wondering if we can get around RDKit's limitations with the following procedure:

  • replace the surface sites X with a nonreactive species like Ar
  • generate that molecule's adjacency list
  • replace Ar with X and delete extra pairs of electrons on the X

It would look something like adding this to the _rdkit_translator() function in RMG-Py/rmgpy/molecule/translator.py:

from rdkit import Chem
import rmgpy.molecule

COX_smiles = 'O=C=[X]'

rdkitmol = Chem.MolFromSmiles(COX_smiles.replace('X', 'Ar'))
mol = rmgpy.molecule.molecule.Molecule()
output = rmgpy.molecule.converter.from_rdkit_mol(mol, rdkitmol)

lines = output.to_adjacency_list().split('\n')

for i, line in enumerate(lines):
    if 'Ar' in line:
        lines[i] = lines[i].replace('Ar', 'X')
        # remove any extra electron pairs...
        lines[i] = lines[i].replace('p3', 'p0')
        lines[i] = lines[i].replace('p2', 'p0')
        lines[i] = lines[i].replace('p1', 'p0')
adj_list = '\n'.join(lines)
COX = rmgpy.molecule.molecule.Molecule().from_adjacency_list(adj_list)

print(COX.smiles)
print(COX.to_adjacency_list())

Other Possible Solution

One workaround is to just instantiate the molecule using the adjacency list. But sometimes that's annoying- especially since we can use smiles for gas phase molecules without any issue. At the very least, a more descriptive error message is probably required, telling the user that you have to use the adjacency list.

@mjohnson541
Copy link
Contributor

If the atom swapping doesn't work well. I have an julia code sitting around from several years ago that I wrote for going from smiles to a julia equivalent of the molecule class. If someone were interested in transcoding that to cython I think that would make it trivial to add this feature and any other modifications we like to smiles=>molecule in the future.

@sevyharris
Copy link
Contributor Author

If the atom swapping doesn't work well. I have an julia code sitting around from several years ago that I wrote for going from smiles to a julia equivalent of the molecule class. If someone were interested in transcoding that to cython I think that would make it trivial to add this feature and any other modifications we like to smiles=>molecule in the future.

I might be interested. It looks like there's some weird recursions going on that are complicating my attempts to implement this at the _rdkit_translator level

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants