Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Match enum with RDKit in the SMILES converter #472

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Kh4L
Copy link
Contributor

@Kh4L Kh4L commented Jan 17, 2024

Signed-off-by: Serge Panev <spanev@nvidia.com>
Copy link
Contributor

@weihua916 weihua916 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Thanks for your question. Basic question: Why is it necessary and how does this affect the model performance?

Also for backward compatibility, I'd like to use these additional features given some flag argument.

@Kh4L
Copy link
Contributor Author

Kh4L commented Jan 31, 2024

Hi! Thanks for your question. Basic question: Why is it necessary and how does this affect the model performance?

This change allow us to represent all of the possible hybridization and edgebond types.

The goal is to match the smile converter implem in PyG https://github.com/pyg-team/pytorch_geometric/blob/51c57da51d45c19497d393cc678c215eb341b3b4/torch_geometric/utils/smiles.py#L7

Also for backward compatibility, I'd like to use these additional features given some flag argument.

What kind of flag are you thinking of?

@weihua916
Copy link
Contributor

weihua916 commented Feb 1, 2024

Got it --- matching with PyG and making the graph expressive is good, but we also don't want to disturb others who have built models on the current version, right?

Why can't people just use PyG's smiles2graph and apply it to PCQMv2 data? You can easily pass your favorite smiles2graph here.

@Kh4L
Copy link
Contributor Author

Kh4L commented Feb 2, 2024

Well, PyG's from_smile returns a Data graph and not a dict of np arrays, so this would not work as it is

What we could do is to check the type of graph here and convert it to a dict of arrays

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants