FreeSolv database is wrong #3682

Open
GregorySchwing opened this issue Nov 18, 2023 · 1 comment

GregorySchwing commented Nov 18, 2023

🐛 Bug

The FreeSolv dataset that DeepChem loads does not match the values in Mobley's FreeSolv database.

To Reproduce

Steps to reproduce the behavior:

  1. Download DeepChem's FreeSolv CSV:
    https://deepchemdata.s3.us-west-1.amazonaws.com/datasets/freesolv.csv.gz
  2. Download Mobley's FreeSolv database (database.txt):
    https://github.com/MobleyLab/FreeSolv/blob/master/database.txt
  3. Cross-reference the first molecule in the DeepChem file by SMILES string:

DeepChem
smiles                     y
CN(C)C(=O)c1ccc(cc1)OC    -1.8744673709079

Mobley
smiles                     experimental value (kcal/mol)
CN(C)C(=O)c1ccc(cc1)OC    -11.01
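
The cross-reference in step 3 could be scripted roughly as below. This is a minimal sketch, not a tested loader: it assumes the DeepChem CSV has `smiles` and `y` columns (as in the excerpt above) and that `database.txt` is semicolon-delimited with `#` comment lines, the SMILES in the second field, and the experimental value in the fourth. The helper names and the `example_id` / `example name` fields are hypothetical. It works on in-memory text, so the two files still need to be downloaded and decompressed separately.

```python
import csv
import io


def load_deepchem(text):
    """Parse DeepChem-style CSV text with 'smiles' and 'y' columns (assumed)."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["smiles"]: float(row["y"]) for row in reader}


def load_mobley(text):
    """Parse semicolon-delimited text, skipping blank and '#' comment lines.

    Assumed field order: id; SMILES; name; experimental value; ...
    """
    values = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = [f.strip() for f in line.split(";")]
        values[fields[1]] = float(fields[3])
    return values


def mismatches(deepchem, mobley, tol=1e-6):
    """List (smiles, deepchem_y, mobley_value) for shared SMILES that differ."""
    return [
        (smiles, y, mobley[smiles])
        for smiles, y in deepchem.items()
        if smiles in mobley and abs(y - mobley[smiles]) > tol
    ]


# One-row stand-ins built from the values quoted in this issue.
deepchem_text = "smiles,y\nCN(C)C(=O)c1ccc(cc1)OC,-1.8744673709079\n"
mobley_text = (
    "# id; SMILES; name; experimental value (kcal/mol)\n"
    "example_id; CN(C)C(=O)c1ccc(cc1)OC; example name; -11.01\n"
)

result = mismatches(load_deepchem(deepchem_text), load_mobley(mobley_text))
print(result)
```

SMILES are compared as raw strings here, so molecules written differently in the two files would be skipped rather than reported.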

Expected behavior

The DeepChem values should match the reference values in Mobley's database.

@rbharath
Member

The DeepChem version may be pre-normalized. We should check by plotting the two distributions.

CC @ARY2260
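
Besides plotting, one quick numeric check is to z-score the reference values and see whether they reproduce the DeepChem column. The sketch below assumes that "pre-normalized" means a standard (mean 0, standard deviation 1) transform using the population standard deviation, and that both lists are aligned molecule by molecule; neither is confirmed in this thread. The function names and the toy values are illustrative only.

```python
import statistics


def zscore(values):
    """Standardize values to zero mean and unit (population) std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def looks_prenormalized(candidate, reference, tol=1e-2):
    """True if candidate matches the z-scored reference element-wise within tol."""
    if len(candidate) != len(reference):
        return False
    return all(abs(c - z) <= tol for c, z in zip(candidate, zscore(reference)))


# Toy values for illustration only, not taken from either dataset.
reference = [-11.01, -5.0, -2.0, 1.0]
candidate = zscore(reference)

print(looks_prenormalized(candidate, reference))   # candidate is the z-scored reference
print(looks_prenormalized(reference, reference))   # raw values are not z-scored
```

On the real data, `reference` would be the experimental column from `database.txt` and `candidate` the `y` column from the DeepChem CSV, aligned by SMILES.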
