Reproducing the paper's result #35

Open
minju-hits opened this issue Jul 25, 2023 · 0 comments

Dear @arneschneuing,

I am currently working on structure-based drug design (SBDD) and have been deeply impressed by the performance of your molecule generation work.
Thank you also for sharing your code.

I have been trying to reproduce the results presented in Table 1 of your paper, specifically the "CrossDocked DiffSBDD-inpaint (C-alpha)" entry, using the provided checkpoint.

Here are the steps I followed:

  1. Created the conda environment.
  2. Prepared the data:
    2.1. Downloaded the CrossDocked data from the Pocket2Mol GitHub repository.
    2.2. Ran python process_crossdock.py <crossdocked_dir> --no_H
  3. Sampled molecules for all pockets in the test set:

    python test.py checkpoints/ca_inpaint.ckpt --test_dir <crossdocked_dir>/processed_noH/test/ --outdir <output_dir> --fix_n_nodes
  4. Calculated the metrics using your provided code:
import glob

from rdkit import Chem

from analysis.metrics import MoleculeProperties

mol_metrics = MoleculeProperties()

# Collect the generated molecules, one list per SDF file (one file per pocket).
sdf_names = glob.glob("<output_dir>/test_set/processed/*.sdf")
pocket_mols_lst = []
for sdf_name in sdf_names:
    with Chem.SDMolSupplier(sdf_name) as suppl:
        # Skip entries that RDKit could not parse (returned as None).
        pocket_mols = [x for x in suppl if x is not None]
    pocket_mols_lst.append(pocket_mols)

all_qed, all_sa, all_logp, all_lipinski, per_pocket_diversity = mol_metrics.evaluate(pocket_mols_lst)
print(len(pocket_mols_lst))  # 55
print([len(x) for x in pocket_mols_lst])
# [100, 97, 97, 93, 97, 99, 94, 98, 97, 94, 98, 98, 100, 98, 96, 97, 99, 95, 98, 98, 96, 97, 96, 96, 95, 97, 97, 97, 98, 94, 97, 97, 99, 98, 97, 98, 98, 97, 99, 99, 97, 96, 98, 99, 97, 97, 97, 99, 97, 97, 98, 92, 95, 89, 98]
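
As a quick sanity check on the counts printed above, the per-pocket list lengths can be summed and compared with the molecule and pocket totals reported in the results. This is only a minimal sketch: the `counts` list is copied by hand from the printed output, and the variable names are illustrative rather than part of the repository.

```python
# Per-pocket molecule counts, copied from the printed output above.
counts = [
    100, 97, 97, 93, 97, 99, 94, 98, 97, 94, 98,
    98, 100, 98, 96, 97, 99, 95, 98, 98, 96, 97,
    96, 96, 95, 97, 97, 97, 98, 94, 97, 97, 99,
    98, 97, 98, 98, 97, 99, 99, 97, 96, 98, 99,
    97, 97, 97, 99, 97, 97, 98, 92, 95, 89, 98,
]

num_pockets = len(counts)        # number of SDF files that were loaded
total_molecules = sum(counts)    # molecules that RDKit parsed successfully

print(f"{total_molecules} molecules from {num_pockets} pockets")
```

If the two numbers differ from the totals in the evaluation summary, some molecules were probably dropped between loading and evaluation.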

My results are below (CrossDocked, DiffSBDD-cond (C-alpha)), and I have attached my output file.
testset.tar.gz

5331 molecules from 55 pockets evaluated.
QED: 0.510 \pm 0.14
SA: 0.349 \pm 0.09
LogP: -0.295 \pm 0.97
Lipinski: 4.875 \pm 0.37
Diversity: 0.774 \pm 0.07
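
For readers comparing numbers, the following sketch shows one way summary lines in this "mean \pm std" format could be computed from per-pocket metric lists. It rests on assumptions not confirmed by the source: that each metric is a list of per-pocket lists (diversity being one value per pocket), and that the spread is a population standard deviation. The helper names `flatten` and `summarize` and the toy values are hypothetical and are not results from the model.

```python
from statistics import mean, pstdev


def flatten(per_pocket_values):
    """Flatten a list of per-pocket lists into one flat list of values."""
    return [v for pocket in per_pocket_values for v in pocket]


def summarize(name, values):
    """Format the mean and population standard deviation of `values`."""
    return f"{name}: {mean(values):.3f} \\pm {pstdev(values):.2f}"


# Hypothetical toy inputs (assumption: NOT taken from the model output).
toy_qed = [[0.4, 0.6], [0.5, 0.5]]   # one inner list per pocket
toy_diversity = [0.7, 0.8]           # one value per pocket

qed_line = summarize("QED", flatten(toy_qed))
diversity_line = summarize("Diversity", toy_diversity)

print(qed_line)
print(diversity_line)
```

Whether the paper pools all molecules before averaging or averages per pocket first is not stated here, and the two choices can give different spreads, so this may be worth confirming when comparing against Table 1.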

However, I could not obtain the same results as those reported in the paper. I also looked into the related issue, but it did not provide a clear answer to my question.

I have a couple of questions that I hope you could assist me with:

  1. Could you explain how to accurately reproduce the results in Table 1? The 'test.py' script offers various options, and I am unsure which settings to use with the checkpoint to match the reported numbers.
    Screenshot 2023-07-25 2.26.14 PM

  2. The repository contains two checkpoints, yet Table 1 of the paper lists four model variants.
    Which variants do the two provided checkpoints correspond to? Could you also provide the other two checkpoints? Having them would help me replicate the findings.

If you require any additional information or have further questions, don't hesitate to reach out to me. Thank you for your time and consideration.

Best regards,
MinJu.
