PrecomputedDescriptorFromFile doesn't accept stereo smiles? #3

Open
jeremycheminf opened this issue Mar 18, 2024 · 2 comments

jeremycheminf commented Mar 18, 2024

Hi,
I'm not sure what is going on here, but after spending some time trying to understand why my file was failing while the test file in the data folder was working, I used the following function to remove stereochemistry from the SMILES:

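from rdkit import Chem

def remove_stereo(smiles):
    # Parse the SMILES, then write it back out without stereo information
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # RDKit returns None when the SMILES cannot be parsed
        return None
    return Chem.MolToSmiles(mol, isomericSmiles=False)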

I then saved the file with the fp column and could run the study; otherwise I got an error saying that the descriptors cannot be calculated.
I have no idea why, because other descriptors such as "UnscaledPhyschemDescriptors" worked without any issues.
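To see which rows of a file would be affected by the stereo workaround described above, a quick text-level check can flag SMILES that carry stereo markers. This is only a minimal sketch: `has_stereo_markers` and the example SMILES are hypothetical, and the check looks at characters only, so it is not a substitute for parsing with RDKit.

```python
import re

# '@' marks tetrahedral centres; '/' and '\' mark double-bond geometry.
STEREO_MARKERS = re.compile(r"[@/\\]")

def has_stereo_markers(smiles: str) -> bool:
    """Return True if the SMILES string contains a stereo marker character."""
    return STEREO_MARKERS.search(smiles) is not None

# Hypothetical example input, not taken from the attached file.
smiles_list = ["CCO", "C[C@H](N)C(=O)O", "F/C=C/F"]
flagged = [s for s in smiles_list if has_stereo_markers(s)]
print(flagged)
```

Comparing the flagged rows with the molecules reported as failing would show whether the failures really track stereochemistry.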

Another problem with the function: if the column we want to add is recognized as integer by pandas, it fails because the code expects floats (I used value + 0.000001 to get around it).
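A possible alternative to adding 0.000001 is to cast the column to float before saving, which keeps the values exact. The sketch below assumes a CSV with a hypothetical `smiles` column and an `fp` column of whole numbers; it only shows the pandas side and has not been checked against PrecomputedDescriptorFromFile itself.

```python
import io
import pandas as pd

# Hypothetical data: "fp" holds whole numbers, so pandas infers an integer dtype.
csv_text = "smiles,fp\nCCO,1\nCCN,2\n"

df = pd.read_csv(io.StringIO(csv_text))
print(df["fp"].dtype)

# Option 1: request float while reading the file.
df_float = pd.read_csv(io.StringIO(csv_text), dtype={"fp": float})

# Option 2: cast an existing column, then write the file back out.
df_cast = df.copy()
df_cast["fp"] = df_cast["fp"].astype(float)
out = df_cast.to_csv(index=False)
print(out)
```

Either way the values are written with a decimal point, so they should be read back as floats.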

lewismervin1 added a commit that referenced this issue Mar 25, 2024
Collaborator

lewismervin1 commented Mar 25, 2024

Thanks @jeremycheminf. I do not understand either why the Unscaled version would pass but the Scaled version would fail. Could you please share an example of the config JSON? I would like to check that settings such as deduplication are the same between the run that worked and the run that failed/removed those molecules.

I also just pushed aacc080, which I hope fixes the specific issue with the datatype used in the precomputed descriptors.

Author

jeremycheminf commented Mar 25, 2024

Hi,
I managed to reproduce this with an example built from some SMILES in the data folder plus 2 compounds that I created with stereo.
Running with stereo and precomputed descriptors fails; saving the file with @ removed works, even if 2 compounds in training now don't have extra info. (The notebook is zipped so that I can upload it.)
Copy_Stereo.zip

Example_file_Qptuna.csv

Thank you
