I have a task that requires the USPTO data restricted to sanitizable molecules, but with the CXSMILES information retained. However, if I keep the CXSMILES, the "remove_unsanitizable" pipeline step tries to sanitize the products together with the CXSMILES and naturally fails, which results in 700k reactions being invalidated. It would be nice if the product SMILES never contained CXSMILES when processed by RDKit, even if the CXSMILES were not removed.
Thanks for your feedback.
This is on our to-do list. It would naturally be better to parse the CXSMILES correctly, which would entail placing level-1 parentheses around the SMILES that should be considered one molecule.
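In the meantime, a possible workaround is to split the extension off before the products are handed to the sanitization step. The following is only a minimal sketch in plain Python, not the pipeline's actual code: the helper names `split_cxsmiles` and `products_without_extension` are hypothetical, and it assumes the extension is a single `|...|` block separated from the reaction SMILES by a space.

```python
from typing import Tuple


def split_cxsmiles(smiles: str) -> Tuple[str, str]:
    """Split a (reaction) SMILES into its core and its CXSMILES extension.

    Assumption: the extension, if present, follows the first space and is
    delimited by a pair of '|' characters. Returns (core, extension), where
    extension is "" when no such block is found.
    """
    core, _, tail = smiles.strip().partition(" ")
    tail = tail.strip()
    if tail.startswith("|") and tail.count("|") >= 2:
        # Keep everything up to and including the closing '|'.
        extension = tail[: tail.index("|", 1) + 1]
        return core, extension
    return core, ""


def products_without_extension(rxn_smiles: str) -> Tuple[str, str]:
    """Return (product SMILES, extension) for a reaction SMILES.

    The product string is free of the extension, so it can be passed to a
    sanitization check on its own, while the extension is kept for later.
    """
    core, extension = split_cxsmiles(rxn_smiles)
    parts = core.split(">")
    if len(parts) != 3:
        raise ValueError(f"not a reaction SMILES: {rxn_smiles!r}")
    return parts[2], extension


if __name__ == "__main__":
    rxn = "CC(=O)O.OCC>[H+].[Cl-]>CC(=O)OCC |f:2.3|"
    products, ext = products_without_extension(rxn)
    print(products)  # product SMILES only
    print(ext)       # extension, retained separately
```

With this split, the extension can be stored in its own column and reattached after filtering, so the sanitization check only ever sees the bare product SMILES.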