bondnet not training or predicting #1

Open
ABehal2020 opened this issue Aug 8, 2022 · 17 comments

Comments

@ABehal2020

ABehal2020 commented Aug 8, 2022

I followed the instructions on the bondnet branch of this GitHub repo to use the updated version of bondnet.

When I try to train, I get the following error:

(rxnrep) Adityas-MacBook-Pro:rxnrep adityabehal$ python run.py model/decoder=regressor.yaml datamodule=regression/electrolyte.yaml
Traceback (most recent call last):
  File "run.py", line 6, in <module>
    from rxnrep.train import train
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/train.py", line 7, in <module>
    from pytorch_lightning import (
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning import metrics  # noqa: E402
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
    from pytorch_lightning.metrics.classification import (  # noqa: F401
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
    from pytorch_lightning.metrics.utils import deprecated_metrics
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/metrics/utils.py", line 22, in <module>
    from torchmetrics.utilities.data import get_num_classes as _get_num_classes
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/torchmetrics/utilities/data.py)

When I try to predict, I get the following error (I assume if the model is not trained locally then it should default to the pretrained version?):

(rxnrep) Adityas-MacBook-Pro:rxnrep adityabehal$ bondnet examples/reactions_mrnet.json
Traceback (most recent call last):
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/bin/bondnet", line 33, in <module>
    sys.exit(load_entry_point('rxnrep', 'console_scripts', 'bondnet')())
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/bin/bondnet", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/predict.py", line 15, in <module>
    from rxnrep.data.electrolyte import ElectrolyteDataset
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/electrolyte.py", line 9, in <module>
    from rxnrep.data.datamodule import (
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/datamodule.py", line 5, in <module>
    from pytorch_lightning import LightningDataModule
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning import metrics  # noqa: E402
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
    from pytorch_lightning.metrics.classification import (  # noqa: F401
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
    from pytorch_lightning.metrics.utils import deprecated_metrics
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/metrics/utils.py", line 22, in <module>
    from torchmetrics.utilities.data import get_num_classes as _get_num_classes
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/torchmetrics/utilities/data.py)

@mjwen
Owner

mjwen commented Aug 9, 2022

Hi @ABehal2020, this seems to be a torchmetrics version issue. Can you check which version of torchmetrics is installed in your environment? Version 0.6.2, as specified in environment.yml, should work.
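
For anyone following along, one way to run this check is from Python inside the activated environment. This is only a minimal sketch: the helper name `version_status` is made up for illustration, and the expected version 0.6.2 is the one quoted in the comment above.

```python
from importlib.metadata import PackageNotFoundError, version


def version_status(package: str, expected: str) -> str:
    """Describe how the installed version of `package` compares to `expected`."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    if installed == expected:
        return f"{package} {installed} matches the expected version"
    return f"{package} {installed} is installed, but {expected} is expected"


if __name__ == "__main__":
    # 0.6.2 is the version named in this thread; adjust if environment.yml changes.
    print(version_status("torchmetrics", "0.6.2"))
```

The same information is available from the command line with `pip show torchmetrics` or `conda list torchmetrics`.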

@ABehal2020
Author

ABehal2020 commented Aug 9, 2022

I followed these instructions from the bondnet branch of this rxnrep repo, as I would like to use the updated bondnet:

git clone https://github.com/mjwen/rxnrep.git
cd rxnrep
git checkout bondnet
conda env create -f environment.yml
conda activate rxnrep
pip install -r requirements.txt
pip install -e .

After this, my torchmetrics version was 0.9.3, so I downgraded by running the following: conda install torchmetrics=0.6.2

I then tried to train the model by running the following:

(rxnrep) Adityas-MacBook-Pro:rxnrep adityabehal$ python run.py model/decoder=regressor.yaml datamodule=regression/electrolyte.yaml
Traceback (most recent call last):
  File "run.py", line 6, in <module>
    from rxnrep.train import train
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/train.py", line 5, in <module>
    import wandb
ModuleNotFoundError: No module named 'wandb'
(rxnrep) Adityas-MacBook-Pro:rxnrep adityabehal$ conda install wandb

After installing wandb, I reran the training command: python run.py model/decoder=regressor.yaml

This gave me another stack trace:

(rxnrep) Adityas-MacBook-Pro:rxnrep adityabehal$ python run.py model/decoder=regressor.yaml datamodule=regression/electrolyte.yaml
Using backend: pytorch
⚙ CONFIG
├── seed
│ └── 35
├── restore
│ └── False
├── skip_test
│ └── False
├── git_repo_path
│ └── None
├── return_val_metric_score
│ └── False
├── original_working_dir
│ └── /Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep
├── datamodule
│ └── regression:
target: rxnrep.data.electrolyte.ElectrolyteRegressionDataModule
│ batch_size: 100
│ num_workers: 0
│ pin_memory: true
│ state_dict_filename: dataset_state_dict.yaml
│ testset_filename: /Users/mjwen/Documents/Dataset/electrolyte/reactions_n2000_test.json
│ trainset_filename: /Users/mjwen/Documents/Dataset/electrolyte/reactions_n2000_train.json
│ valset_filename: /Users/mjwen/Documents/Dataset/electrolyte/reactions_n2000_val.json

├── model
│ └── decoder:
│ cfg_adjuster:
target: rxnrep.model.regressor.adjust_config
│ model_class:
target: rxnrep.model.regressor.LightningModel
│ property_name:
│ - reaction_energy
│ regression_decoder_hidden_layer_sizes:
│ - - 64
│ - 50
│ regression_decoder_num_layers:
│ - 2
│ encoder:
│ activation: ReLU
│ combine_reactants_products: difference
│ conv: GatedGCNConv
│ conv_layer_size: 64
│ embedding_size: 64
│ has_global_feats: true
│ mlp_diff_layer_batch_norm: true
│ mlp_diff_layer_sizes: []
│ mlp_pool_layer_batch_norm: true
│ mlp_pool_layer_sizes: []
│ molecule_batch_norm: true
│ molecule_conv_layer_sizes:
│ - 64
│ - 64
│ molecule_dropout: 0.0
│ molecule_num_fc_layers: 2
│ molecule_residual: true
│ num_mlp_diff_layers: 0
│ num_mlp_pool_layers: 0
│ num_mol_conv_layers: 2
│ num_rxn_conv_layers: 0
│ pool_atom_feats: true
│ pool_bond_feats: false
│ pool_global_feats: false
│ pool_kwargs: null
│ pool_method: attentive_reduce_sum
│ reaction_batch_norm: true
│ reaction_conv_layer_sizes: []
│ reaction_dropout: 0.0
│ reaction_num_fc_layers: 2
│ reaction_residual: true

├── optimizer
│ └── lr: 0.001
│ lr_scheduler:
│ epochs: 10
│ lr_min: 1.0e-06
│ lr_warmup_step: 10
│ scheduler_name: cosine
│ weight_decay: 1.0e-06

├── trainer
│ └── target: pytorch_lightning.Trainer
│ max_epochs: 10
│ num_sanity_val_steps: 2
│ progress_bar_refresh_rate: 100
│ resume_from_checkpoint: null
│ sync_batchnorm: true
│ weights_summary: top

├── logger
│ └── wandb:
target: pytorch_lightning.loggers.wandb.WandbLogger
│ id: null
│ project: tmp-rxnrep

└── callbacks
└── early_stopping:
target: pytorch_lightning.callbacks.EarlyStopping
min_delta: 0
mode: max
monitor: val/score
patience: 100
verbose: true
model_checkpoint:
target: pytorch_lightning.callbacks.ModelCheckpoint
mode: max
monitor: val/score
save_last: true
save_top_k: 3
verbose: false

Global seed set to 35
[2022-08-09 18:17:12,828][/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/train.py][INFO] - Instantiating datamodule: rxnrep.data.electrolyte.ElectrolyteRegressionDataModule
[2022-08-09 18:17:14,898][mip.model][INFO] - Using Python-MIP package version 1.13.0
[2022-08-09 18:17:14,910][rxnrep.data.electrolyte][INFO] - Start reading dataset file...
Traceback (most recent call last):
  File "run.py", line 64, in <module>
    main()
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 346, in _run_hydra
    run_and_report(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 347, in <lambda>
    lambda: hydra.run(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 107, in run
    return run_job(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/core/utils.py", line 127, in run_job
    ret.return_value = task_function(task_cfg)
  File "run.py", line 60, in main
    train(cfg_final)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/train.py", line 62, in train
    datamodule.setup()
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/core/datamodule.py", line 385, in wrapped_fn
    return fn(*args, **kwargs)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/electrolyte.py", line 157, in setup
    self.data_train = ElectrolyteDataset(
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/dataset.py", line 500, in __init__
    super().__init__(
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/dataset.py", line 90, in __init__
    self.reactions, self._failed = self.read_file(filename)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/electrolyte.py", line 48, in read_file
    return read_electrolyte_dataset(filename, self.nprocs)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/electrolyte.py", line 31, in read_electrolyte_dataset
    succeed_reactions, failed = read_mrnet_reaction_dataset(filename, nprocs)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/io.py", line 110, in read_mrnet_reaction_dataset
    mrnet_reactions = loadfn(filename)
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/monty/serialization.py", line 63, in loadfn
    with zopen(fn, "rt") as fp:
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/monty/io.py", line 45, in zopen
    return open(filename, *args, **kwargs)  # pylint: disable=R1732
FileNotFoundError: [Errno 2] No such file or directory: '/Users/mjwen/Documents/Dataset/electrolyte/reactions_n2000_train.json'

I appreciate any help. Thank you!

@ABehal2020
Author

I believe these three JSON files need to be pushed to the bondnet branch of the rxnrep repo for local training to work on my computer:

trainset_filename: /Users/mjwen/Documents/Dataset/electrolyte/reactions_n2000_train.json
valset_filename: /Users/mjwen/Documents/Dataset/electrolyte/reactions_n2000_val.json
testset_filename: /Users/mjwen/Documents/Dataset/electrolyte/reactions_n2000_test.json

Thank you!

@mjwen
Owner

mjwen commented Aug 10, 2022

Ah, thanks for catching this! I just added the dataset to the repo. Please pull the latest bondnet branch to get it.

@mjwen
Owner

mjwen commented Aug 10, 2022

Thanks for the detailed info. torchmetrics==0.6.2 was specified as a dependency in the main branch but not in the bondnet branch. This has been fixed, and wandb has also been added as a dependency.

@ABehal2020
Author

ABehal2020 commented Aug 10, 2022

Thanks! I updated my local rxnrep folder with the latest bondnet branch changes.

New stack trace when running model training command locally on my Mac (prediction on pretrained model works):
(rxnrep) Adityas-MacBook-Pro:rxnrep adityabehal$ python run.py model/decoder=regressor.yaml datamodule=regression/electrolyte.yaml
Using backend: pytorch
⚙ CONFIG
├── seed
│ └── 35
├── restore
│ └── False
├── skip_test
│ └── False
├── git_repo_path
│ └── None
├── return_val_metric_score
│ └── False
├── original_working_dir
│ └── /Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep
├── datamodule
│ └── regression:
target: rxnrep.data.electrolyte.ElectrolyteRegressionDataModule
│ batch_size: 100
│ num_workers: 0
│ pin_memory: true
│ state_dict_filename: dataset_state_dict.yaml
│ testset_filename: ./dataset/reactions_n2000_test.json
│ trainset_filename: ./dataset/reactions_n2000_train.json
│ valset_filename: ./dataset/reactions_n2000_val.json

├── model
│ └── decoder:
│ cfg_adjuster:
target: rxnrep.model.regressor.adjust_config
│ model_class:
target: rxnrep.model.regressor.LightningModel
│ property_name:
│ - reaction_energy
│ regression_decoder_hidden_layer_sizes:
│ - - 64
│ - 50
│ regression_decoder_num_layers:
│ - 2
│ encoder:
│ activation: ReLU
│ combine_reactants_products: difference
│ conv: GatedGCNConv
│ conv_layer_size: 64
│ embedding_size: 64
│ has_global_feats: true
│ mlp_diff_layer_batch_norm: true
│ mlp_diff_layer_sizes: []
│ mlp_pool_layer_batch_norm: true
│ mlp_pool_layer_sizes: []
│ molecule_batch_norm: true
│ molecule_conv_layer_sizes:
│ - 64
│ - 64
│ molecule_dropout: 0.0
│ molecule_num_fc_layers: 2
│ molecule_residual: true
│ num_mlp_diff_layers: 0
│ num_mlp_pool_layers: 0
│ num_mol_conv_layers: 2
│ num_rxn_conv_layers: 0
│ pool_atom_feats: true
│ pool_bond_feats: false
│ pool_global_feats: false
│ pool_kwargs: null
│ pool_method: attentive_reduce_sum
│ reaction_batch_norm: true
│ reaction_conv_layer_sizes: []
│ reaction_dropout: 0.0
│ reaction_num_fc_layers: 2
│ reaction_residual: true

├── optimizer
│ └── lr: 0.001
│ lr_scheduler:
│ epochs: 10
│ lr_min: 1.0e-06
│ lr_warmup_step: 10
│ scheduler_name: cosine
│ weight_decay: 1.0e-06

├── trainer
│ └── target: pytorch_lightning.Trainer
│ max_epochs: 10
│ num_sanity_val_steps: 2
│ progress_bar_refresh_rate: 100
│ resume_from_checkpoint: null
│ sync_batchnorm: true
│ weights_summary: top

├── logger
│ └── wandb:
target: pytorch_lightning.loggers.wandb.WandbLogger
│ id: null
│ project: tmp-rxnrep

└── callbacks
└── early_stopping:
target: pytorch_lightning.callbacks.EarlyStopping
min_delta: 0
mode: max
monitor: val/score
patience: 100
verbose: true
model_checkpoint:
target: pytorch_lightning.callbacks.ModelCheckpoint
mode: max
monitor: val/score
save_last: true
save_top_k: 3
verbose: false
Global seed set to 35
[2022-08-09 22:46:15,494][/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/train.py][INFO] - Instantiating datamodule: rxnrep.data.electrolyte.ElectrolyteRegressionDataModule
[2022-08-09 22:46:17,836][mip.model][INFO] - Using Python-MIP package version 1.13.0
[2022-08-09 22:46:17,847][rxnrep.data.electrolyte][INFO] - Start reading dataset file...
Traceback (most recent call last):
  File "run.py", line 64, in <module>
    main()
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 346, in _run_hydra
    run_and_report(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 347, in <lambda>
    lambda: hydra.run(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 107, in run
    return run_job(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/core/utils.py", line 127, in run_job
    ret.return_value = task_function(task_cfg)
  File "run.py", line 60, in main
    train(cfg_final)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/train.py", line 62, in train
    datamodule.setup()
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/core/datamodule.py", line 385, in wrapped_fn
    return fn(*args, **kwargs)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/electrolyte.py", line 157, in setup
    self.data_train = ElectrolyteDataset(
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/dataset.py", line 500, in __init__
    super().__init__(
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/dataset.py", line 90, in __init__
    self.reactions, self._failed = self.read_file(filename)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/electrolyte.py", line 48, in read_file
    return read_electrolyte_dataset(filename, self.nprocs)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/electrolyte.py", line 31, in read_electrolyte_dataset
    succeed_reactions, failed = read_mrnet_reaction_dataset(filename, nprocs)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/rxnrep/data/io.py", line 110, in read_mrnet_reaction_dataset
    mrnet_reactions = loadfn(filename)
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/monty/serialization.py", line 63, in loadfn
    with zopen(fn, "rt") as fp:
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/monty/io.py", line 45, in zopen
    return open(filename, *args, **kwargs)  # pylint: disable=R1732
FileNotFoundError: [Errno 2] No such file or directory: '/Users/adityabehal/Documents/RPI/RCOS/IBM-quantum-computing/rxnrep/outputs/2022-08-09/22-46-14/dataset/reactions_n2000_train.json'

@mjwen
Owner

mjwen commented Aug 10, 2022

Modify electrolyte.yaml to set trainset_filename, valset_filename and testset_filename to the absolute paths of the data files on your machine. The data files are located in the dataset directory of the repo.
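
The traceback above shows why absolute paths are needed: the relative path `./dataset/reactions_n2000_train.json` was looked up under the per-run directory `outputs/2022-08-09/22-46-14/` rather than under the repo root. The sketch below only illustrates that resolution behavior with the standard library; the helper `resolve` and the example directories are hypothetical and are not part of rxnrep.

```python
from pathlib import PurePosixPath


def resolve(path: str, base_dir: str) -> str:
    """Resolve `path` against `base_dir`; absolute paths are returned unchanged."""
    p = PurePosixPath(path)
    if p.is_absolute():
        return str(p)
    # A leading "./" is dropped when the path is parsed, so a plain join is enough.
    return str(PurePosixPath(base_dir) / p)


# Hypothetical locations, for illustration only.
repo_root = "/home/user/rxnrep"
run_dir = "/home/user/rxnrep/outputs/2022-08-09/22-46-14"

relative = "./dataset/reactions_n2000_train.json"
print(resolve(relative, run_dir))    # where a relative path ends up inside a run directory
print(resolve(relative, repo_root))  # where the data file actually lives
```

Hydra also provides `hydra.utils.to_absolute_path`, which resolves a relative path against the directory the program was launched from; whether rxnrep uses it is not stated in this thread, so absolute paths in the YAML file remain the fix suggested here.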

@ABehal2020
Author

Thanks! Local training works now. I'd like to use my locally trained model to make predictions. When I train the model and then run prediction, here's the output:

(rxnrep) Adityas-MacBook-Pro:rxnrep adityabehal$ bondnet examples/reactions_mrnet.json
Using backend: pytorch

Find model directory ./rxnrep_model; will reuse it.
Finish making predictions. Results written to result.json. The bond dissociation energy is denoted by predicted_reaction_energy in each reaction.

I believe it is still using the pretrained model that was pulled earlier from Google Drive. How can I make the predict command use the locally trained model?

@mjwen
Owner

mjwen commented Aug 10, 2022

Yes, it is using the pretrained model. Instructions for using your own model have been added to the README.

@ABehal2020
Author

ABehal2020 commented Aug 12, 2022

Thanks! I updated the local paths in datamodule/regression/electrolyte.yaml to point to the train, test, and validation mrnet files on my computer. I have a few questions:

  1. While training the model, I only get one test point that is viewable on Wandb. Does this correspond to the average of all the test dataset points to depict the average test loss after the model training is finished?
  2. Also, what does the trainer/global_step on the x axis correspond to? Is there a way I can see training and validation learning curves graphed on an epoch by epoch basis?
  3. While trying to generate predictions from my locally trained model, I get the following stack trace:
    (rxnrep) Adityas-MacBook-Pro:rxnrep adityabehal$ bondnet examples/reactions_mrnet.json --model outputs/2022-08-12/22-42-44

Traceback (most recent call last):
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/bin/bondnet", line 33, in <module>
    sys.exit(load_entry_point('rxnrep', 'console_scripts', 'bondnet')())
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/bin/bondnet", line 22, in importlib_load_entry_point
    for entry_point in distribution(dist_name).entry_points
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/importlib/metadata.py", line 503, in distribution
    return Distribution.from_name(distribution_name)
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/importlib/metadata.py", line 177, in from_name
    raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: rxnrep

[Attached screenshots: Screen Shot 2022-08-12 at 2 45 58 PM, 2 45 53 PM, and 2 45 36 PM]

@mjwen
Owner

mjwen commented Aug 15, 2022

Thanks! I updated the datamodule/regression/electrolyte.yaml file local paths to point to the train, test, and validation mrnet files on my computer. I have a few questions:

  1. While training the model, I only get one test point that is viewable on Wandb. Does this correspond to the average of all the test dataset points to depict the average test loss after the model training is finished?

Seeing only one test error is the correct behavior: testing is performed only once, at the end of training, using the best-performing model. The test error is the mean absolute error over the entire test set, so if you want the prediction error for each individual point, you will need to use the bondnet predict command.
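
To make the distinction concrete, here is a small sketch of the two quantities: the single mean absolute error reported for a whole set, and the per-point absolute errors one could compute from individual predictions. The function names and the numbers are invented for illustration and do not come from rxnrep.

```python
from typing import List, Sequence


def absolute_errors(predicted: Sequence[float], target: Sequence[float]) -> List[float]:
    """Absolute error of each prediction, one value per data point."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return [abs(p - t) for p, t in zip(predicted, target)]


def mean_absolute_error(predicted: Sequence[float], target: Sequence[float]) -> float:
    """A single number summarizing the whole set: the mean of the absolute errors."""
    errors = absolute_errors(predicted, target)
    if not errors:
        raise ValueError("cannot compute the mean of an empty set")
    return sum(errors) / len(errors)


# Made-up reaction energies, purely illustrative.
predicted = [1.0, 2.5, 4.0]
target = [1.5, 2.5, 3.0]
print(absolute_errors(predicted, target))      # per-point errors
print(mean_absolute_error(predicted, target))  # the one value a test run would report
```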

  2. Also, what does the trainer/global_step on the x axis correspond to? Is there a way I can see training and validation learning curves graphed on an epoch by epoch basis?

Deep neural nets are trained in mini-batches, and the global step is the number of mini-batch steps taken so far. Alternatively, you can view the epochs (i.e. the number of passes over your training data) by choosing it from the dropdown at the upper right of the wandb interface.
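
The relationship between the two x axes can be sketched as below, assuming one optimizer step per mini-batch and no gradient accumulation. The batch size of 100 matches the config dump earlier in this thread; the training-set size of 1600 is a hypothetical number chosen only to make the arithmetic visible.

```python
import math


def steps_per_epoch(num_train_samples: int, batch_size: int) -> int:
    """Number of mini-batch steps needed for one pass over the training data."""
    return math.ceil(num_train_samples / batch_size)


def epoch_of_step(global_step: int, num_train_samples: int, batch_size: int) -> int:
    """Zero-based epoch that a given zero-based global step falls in."""
    return global_step // steps_per_epoch(num_train_samples, batch_size)


# batch_size=100 is from the config above; 1600 samples is an assumed size.
print(steps_per_epoch(1600, 100))    # steps in one epoch
print(epoch_of_step(35, 1600, 100))  # epoch containing global step 35
```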

  3. While trying to generate predictions from my locally trained model, I get the following stack trace:
    (rxnrep) Adityas-MacBook-Pro:rxnrep adityabehal$ bondnet examples/reactions_mrnet.json --model outputs/2022-08-12/22-42-44

I have no immediate idea why this happens. Some guesses: (1) Did you pip install the repo? (2) Are you issuing the bondnet command from inside the repo?
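
The first guess can be checked directly. The traceback ends in `importlib.metadata.PackageNotFoundError: rxnrep`, which is what `importlib.metadata.distribution` raises when no installed distribution has that name. The sketch below repeats that lookup; the helper name `console_scripts` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, distribution
from typing import List, Optional


def console_scripts(dist_name: str) -> Optional[List[str]]:
    """Return the console-script names a distribution registers, or None if it is not installed."""
    try:
        dist = distribution(dist_name)
    except PackageNotFoundError:
        return None
    return sorted(ep.name for ep in dist.entry_points if ep.group == "console_scripts")


if __name__ == "__main__":
    scripts = console_scripts("rxnrep")
    if scripts is None:
        print("rxnrep is not installed in this environment; run `pip install -e .` in the repo")
    else:
        print("rxnrep is installed; console scripts:", scripts)
```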

@ABehal2020
Author

  1. Got it - thank you!
  2. That's a great feature of Wandb - thanks for pointing that out to me.
  3. I forgot to pip install the repo - it works now.

Thank you for all your help!

@ABehal2020 ABehal2020 reopened this Nov 4, 2022
@ABehal2020
Author

Hi,

I really love rxnrep's bondnet implementation, and I've enjoyed working with it so far. I have a couple more questions:

  1. Local training with mrnet and SMILES strings works, but when I try to use either locally trained model for prediction, I get this stack trace:

(rxnrep) cary-50:rxnrep adityabehal$ bondnet /Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/examples/reactions_mrnet.json --model /Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/outputs/2022-11-04-smiles
Using backend: pytorch
Using local model /Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/outputs/2022-11-04-smiles to make predictions
Traceback (most recent call last):
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/bin/bondnet", line 33, in <module>
    sys.exit(load_entry_point('rxnrep', 'console_scripts', 'bondnet')())
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/predict.py", line 208, in cli
    prepare_model(model)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/predict.py", line 173, in prepare_model
    checkpoint_path = get_wandb_checkpoint_path(identifier, train_directory)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/utils/wandb.py", line 264, in get_wandb_checkpoint_path
    raise RuntimeError(f"Cannot found job {identifier} in {path}")
RuntimeError: Cannot found job None in /Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/outputs/2022-11-04-smiles

It was working earlier, but after some update to the bondnet branch of this rxnrep repo, it's not working and I'm not sure if this is an issue on my end.

  2. Although I know rxnrep's bondnet branch is for prediction of BDEs, I'm wondering if it's possible to use it to predict individual molecular properties. For example, I have a simple solubility dataset. The original version contained only the SMILES representation of each molecule. Since rxnrep was expecting a reaction with the reactants and products separated by the delimiter ">>", I extended my dataset. For example, given the SMILES string ClCC(Cl)CCl, I changed it to ClCC(Cl)CCl>>ClCC(Cl)CCl and passed it to rxnrep's bondnet architecture.

I got the following stack trace:

[2022-11-04 17:38:57,788][/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/train.py][INFO] - Instantiating datamodule: rxnrep.data.nrel.NRELDataModule
[2022-11-04 17:38:59,925][mip.model][INFO] - Using Python-MIP package version 1.13.0
[2022-11-04 17:39:00,048][rxnrep.data.nrel][INFO] - Start reading dataset ...
[2022-11-04 17:39:01,950][rxnrep.data.nrel][INFO] - Finish reading dataset. Number succeed 9582, number failed 0.
[2022-11-04 17:39:01,950][rxnrep.data.dataset][INFO] - Starting building graphs and featurizing...
Traceback (most recent call last):
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/data/grapher.py", line 350, in build_graph_and_featurize_reaction
    atom_map_number = reaction.get_reactants_atom_map_number(zero_based=True)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/core/reaction.py", line 298, in get_reactants_atom_map_number
    return self._get_atom_map_number(self.reactants, zero_based)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/core/reaction.py", line 595, in _get_atom_map_number
    minimum = int(np.min(np.concatenate(atom_map_number)))
  File "<__array_function__ internals>", line 180, in amin
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 2918, in amin
    return _wrapreduction(a, np.minimum, 'min', axis, None, out,
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
TypeError: '<=' not supported between instances of 'NoneType' and 'NoneType'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "run.py", line 64, in <module>
    main()
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 346, in _run_hydra
    run_and_report(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/utils.py", line 347, in <lambda>
    lambda: hydra.run(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 107, in run
    return run_job(
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/hydra/core/utils.py", line 127, in run_job
    ret.return_value = task_function(task_cfg)
  File "run.py", line 60, in main
    train(cfg_final)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/train.py", line 62, in train
    datamodule.setup()
  File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/pytorch_lightning/core/datamodule.py", line 385, in wrapped_fn
    return fn(*args, **kwargs)
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/data/nrel.py", line 74, in setup
    self.data_train = NRELDataset(
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/data/dataset.py", line 500, in __init__
    super().__init__(
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/data/dataset.py", line 149, in __init__
    self.dgl_graphs = self.build_graph_and_featurize()
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/data/dataset.py", line 284, in build_graph_and_featurize
    dgl_graphs = [
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/data/dataset.py", line 285, in <listcomp>
    build_graph_and_featurize_reaction(
  File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/data/grapher.py", line 376, in build_graph_and_featurize_reaction
    raise RuntimeError(
RuntimeError: Cannot build graph and featurize for reaction: CN(C)C(=O)C1=CC=C(Cl)C=C1>>CN(C)C(=O)C1=CC=C(Cl)C=C1_index-0

I'm wondering if rxnrep can predict individual molecular properties as is or if I'll need to modify it to do so.

I've attached a zip file of the test, train, and val solubility dataset tsv files I'm passing in (solubility-data-files.zip) if you're interested in taking a quick look.

Thank you for all your help so far!

@mjwen
Owner

mjwen commented Nov 5, 2022

  1. The model directory --model /Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/outputs/2022-11-04-smiles seems a bit suspicious to me. Can you check whether there is a wandb directory in it? The prediction code will look for a wandb directory, where the trained model is located.

  2. The short answer is no. The current rxnrep and bondnet are for reaction properties, and not for molecular properties. It is possible to modify it to predict molecular properties, but since there are a lot of other great repos to do it, I am not sure whether you want to do it or not. As an example, check out dgl-lifesci for molecular properties prediction.

  3. Out of curiosity, is the solubility data set you linked from some published paper or public database?
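For point 1, a quick way to check the directory is a sketch like the one below. It assumes only what is described above, namely that the run directory passed to --model should contain a subdirectory named wandb. The helper name `has_wandb_dir` and the example path are hypothetical, not part of rxnrep.

```python
from pathlib import Path


def has_wandb_dir(model_dir: str) -> bool:
    """Return True if `model_dir` directly contains a `wandb` subdirectory."""
    return (Path(model_dir) / "wandb").is_dir()


if __name__ == "__main__":
    # Hypothetical path; substitute the directory you pass to --model.
    model_dir = "outputs/2022-11-04-smiles"
    if has_wandb_dir(model_dir):
        print(f"{model_dir}: wandb directory found")
    else:
        print(f"{model_dir}: no wandb directory here; try the run directory that contains one")
```

If the check fails, the timestamped run directory that training actually wrote to is probably the one to pass instead.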

@ABehal2020
Author

ABehal2020 commented Nov 5, 2022

  1. Thanks - it's working now with the mrnet-trained model for predicting mrnet data. Out of curiosity, if I train the model with smiles data, will it only be able to predict smiles data, or can it predict mrnet data as well (and vice versa for the mrnet-trained model)? Stack trace below:

// using locally trained mrnet model for prediction on mrnet data (will it work with smiles data too?)
(rxnrep) cary-50:rxnrep adityabehal$ bondnet /Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/examples/reactions_mrnet.json --model /Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/outputs/2022-11-05/15-28-33
Using backend: pytorch
Using local model /Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/outputs/2022-11-05/15-28-33 to make predictions
Finish making predictions. Results written to result.json. The bond dissociation energy is denoted by predicted_reaction_energy in each reaction.
// using locally trained smiles model for prediction (fails on mrnet data - does it only work on smiles data?)
(rxnrep) cary-50:rxnrep adityabehal$ bondnet /Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/examples/reactions_mrnet.json --model /Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/outputs/2022-11-05/15-29-30
Using backend: pytorch
Using local model /Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/outputs/2022-11-05/15-29-30 to make predictions
Traceback (most recent call last):
File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/bin/bondnet", line 33, in <module>
sys.exit(load_entry_point('rxnrep', 'console_scripts', 'bondnet')())
File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/adityabehal/opt/anaconda3/envs/rxnrep/lib/python3.8/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/predict.py", line 210, in cli
main(model, data_filename)
File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/predict.py", line 145, in main
data_loader = get_dataloader(dataset_state_dict, data_filename)
File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/predict.py", line 44, in get_dataloader
dataset = ElectrolyteDataset(
File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/data/dataset.py", line 500, in __init__
super().__init__(
File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/data/dataset.py", line 166, in __init__
self.scale_features()
File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/data/dataset.py", line 345, in scale_features
feature_scaler.transform(graphs)
File "/Users/adityabehal/Documents/RPI/RCOS/IBM-ML-Chemistry/rxnrep/rxnrep/data/scaler.py", line 160, in transform
feats = (torch.cat(feats) - m) / s
RuntimeError: The size of tensor a (12) must match the size of tensor b (35) at non-singleton dimension 1

  2. Thanks for letting me know! I'll check it out.

  3. The solubility dataset is publicly available at https://github.com/whitead/dmol-book/raw/master/data/curated-solubility-dataset.csv - I originally saw it in the dmol tutorials (https://dmol.pub/dl/gnn.html).

@mjwen
Owner

mjwen commented Nov 6, 2022

Models trained using smiles can be used for mrnet and vice versa. In prediction, we need to initialize the model and then load the trained model. But the default parameters (e.g. atom feature dimension) for smiles and mrnet input are different, which is why you see the RuntimeError: The size of tensor a (12) must match the size of tensor b (35) at non-singleton dimension 1 error.

More specifically, take a look at this and this. They need to be updated to be the same, and then it should work.
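To illustrate why the sizes have to agree, here is a minimal pure-Python sketch of the standardization step that fails in the traceback (`(feats - m) / s`). It is not the rxnrep scaler; the function name `standardize` and the explicit width check are assumptions added for illustration. In the error, the 12 is presumably the width of the features built at prediction time and the 35 the width of the mean stored with the trained model.

```python
from typing import List, Sequence


def standardize(
    feats: Sequence[Sequence[float]],
    mean: Sequence[float],
    std: Sequence[float],
) -> List[List[float]]:
    """Apply (x - mean) / std column-wise, checking the feature width first."""
    if len(mean) != len(std):
        raise ValueError("mean and std must have the same length")
    for row in feats:
        if len(row) != len(mean):
            # Same situation as the tensor size mismatch, with a clearer message.
            raise ValueError(
                f"feature size {len(row)} does not match the size {len(mean)} "
                "stored with the trained model; the featurizer settings used "
                "for prediction probably differ from those used for training"
            )
    return [[(x - m) / s for x, m, s in zip(row, mean, std)] for row in feats]


if __name__ == "__main__":
    mean, std = [1.0, 2.0], [2.0, 4.0]
    print(standardize([[3.0, 6.0]], mean, std))  # [[1.0, 1.0]]
```

Once the featurizer defaults for the two input types match, the stored mean and the new features have the same width and this step goes through.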

And, thanks for the dataset source!

@ABehal2020
Author

Thank you - I have some questions about smiles string formatting.

  1. I noticed that your SMILES data is formatted differently from the solubility SMILES data (e.g. your SMILES data has a lot more brackets). Is there a way to convert between these two forms? Or is this just a different convention for representing SMILES reactions?

  2. Also, given a SMILES string for a reaction, how can I separate it into reactants and products? Did you just use the ">>" as the delimiter between the reactants and products side?

  3. Is there a way to indicate the solvent for reactions or individual molecules overall?
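For question 2, the general reaction SMILES convention is `reactants>agents>products`, with `.` separating molecules inside each field, so `>>` is simply the case where the agents field is empty. Below is a minimal sketch of splitting on that convention; the function name `split_reaction_smiles` is hypothetical and this is not taken from rxnrep. The middle field is where the convention allows a solvent or catalyst to be listed, but whether rxnrep reads it is not confirmed in this thread.

```python
from typing import List, Tuple


def split_reaction_smiles(rxn: str) -> Tuple[List[str], List[str], List[str]]:
    """Split 'reactants>agents>products' into three lists of molecule SMILES."""
    parts = rxn.strip().split(">")
    if len(parts) != 3:
        raise ValueError(f"expected exactly two '>' separators, got: {rxn!r}")

    def molecules(field: str) -> List[str]:
        # '.' separates disconnected species; drop empty strings from empty fields.
        return [m for m in field.split(".") if m]

    reactants, agents, products = (molecules(p) for p in parts)
    return reactants, agents, products


if __name__ == "__main__":
    r, a, p = split_reaction_smiles("CCO.CC(=O)O>>CCOC(C)=O.O")
    print(r, a, p)  # ['CCO', 'CC(=O)O'] [] ['CCOC(C)=O', 'O']
```

Note that splitting on `.` also separates the ions of a salt written as one entry, so entries like that would need to be regrouped by hand.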
