Constrained Bayesian Optimisation for Automatic Chemical Design using Variational Autoencoders

Welcome to the code accompanying the paper "Constrained Bayesian Optimisation for Automatic Chemical Design using Variational Autoencoders"

https://pubs.rsc.org/en/content/articlehtml/2019/sc/c9sc04026a

The code is based heavily on the implementation of the Aspuru-Guzik group:

https://github.com/aspuru-guzik-group/chemical_vae

INSTALL

Append the package directory location to your PYTHONPATH e.g. by editing the .bashrc file as follows:

vim ~/.bashrc

and adding a line that appends the package directory to PYTHONPATH:

export PYTHONPATH

Then reload the file:

source ~/.bashrc
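To confirm that the change took effect, you can check from Python whether the package directory appears on the import path. The following is a minimal sketch only: the helper name `is_on_path` and the example path are hypothetical placeholders, not part of this repository.

```python
import os
import sys


def is_on_path(directory, search_path=None):
    """Return True if `directory` is one of the entries in `search_path`.

    `search_path` defaults to sys.path, which Python builds from PYTHONPATH
    (among other sources) at start-up.
    """
    if search_path is None:
        search_path = sys.path
    target = os.path.realpath(os.path.expanduser(directory))
    return any(
        os.path.realpath(os.path.expanduser(entry)) == target
        for entry in search_path
        if entry  # empty entries refer to the current directory; skipped here
    )


if __name__ == "__main__":
    # Placeholder: replace with the actual location of the package directory.
    package_dir = "/path/to/Constrained_BO_package"
    print(is_on_path(package_dir))
```

Run it in a new shell after sourcing .bashrc; a result of False suggests the export line was not picked up.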

It is recommended that you install dependencies within a virtual environment. For example, using conda you would run, from the Constrained_BO_package directory, the commands:

conda config --add channels conda-forge

(this adds conda-forge to the existing channels)

source activate env_name

(this activates the environment env_name, which must already exist)

conda install rdkit==2017.09.3

cd Theano-master

python setup.py install

cd ..

conda install numpy==1.13.0

pip install git+https://github.com/rgbombarelli/keras.git#egg=Keras

pip install git+https://github.com/rgbombarelli/seya.git#egg=seya

pip install git+https://github.com/HIPS/autograd.git#egg=autograd

USAGE

The scripts

generate_latent_features_and_targets_example.py

generate_qed_features_and_targets.py

generate_solo_qed.py

must be run first in order to create the features and targets for molecule generation.

  1. Branin_Hoo

Constrained Bayesian Optimisation on the toy Branin-Hoo function.
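For orientation, the standard Branin-Hoo function can be written in a few lines. The sketch below uses the usual textbook constants; the disk constraint and the grid-search baseline are illustrative assumptions (a constraint of this kind appears in the constrained Bayesian optimisation literature) and are not taken from this repository's scripts. All function names are hypothetical.

```python
import math


def branin_hoo(x1, x2):
    """Standard Branin-Hoo function, usually evaluated on x1 in [-5, 10], x2 in [0, 15]."""
    a = 1.0
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1.0 - t) * math.cos(x1) + s


def is_feasible(x1, x2):
    """Illustrative disk constraint (an assumption, not taken from this repository)."""
    return (x1 - 2.5) ** 2 + (x2 - 7.5) ** 2 <= 50.0


def best_feasible_on_grid(n=200):
    """Brute-force baseline: lowest feasible value on an n-by-n grid."""
    best = None
    for i in range(n):
        x1 = -5.0 + 15.0 * i / (n - 1)
        for j in range(n):
            x2 = 15.0 * j / (n - 1)
            if not is_feasible(x1, x2):
                continue
            value = branin_hoo(x1, x2)
            if best is None or value < best[0]:
                best = (value, x1, x2)
    return best
```

The unconstrained function has three global minima of about 0.397887, one of them at (pi, 2.275). Under the illustrative disk constraint only that one remains feasible, which is what makes the toy problem a useful check for a constrained optimiser.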

  2. Chemical_Design

The Unconstrained directory contains scripts that generate molecules using unconstrained Bayesian Optimisation. The Constrained directory contains scripts that generate molecules using constrained Bayesian Optimisation.

Within these directories there are 3 scripts optimising the following objectives:

a) bo_gp.py -> logP + SA + ring-penalty

b) bo_gp_qed -> QED + SA + ring-penalty

c) bo_gp_solo_qed -> QED
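The composite objectives can be sketched as a combination of standardised components. The sketch below assumes that SA and ring-penalty act as penalties (enter with a minus sign) and that each component is standardised against training-set values; both points are assumptions rather than statements about these scripts, and the function names are hypothetical. Component values (logP or QED, SA, ring-penalty) are taken as given, not computed.

```python
import math


def standardise(value, population):
    """Shift and scale `value` by the mean and standard deviation of `population`."""
    n = float(len(population))
    mean = sum(population) / n
    variance = sum((x - mean) ** 2 for x in population) / n
    if variance == 0.0:
        raise ValueError("population has zero variance; cannot standardise")
    return (value - mean) / math.sqrt(variance)


def composite_score(main, sa, ring_penalty, train_main, train_sa, train_ring):
    """Standardised main property (logP or QED) minus standardised SA and ring-penalty.

    The minus signs are an assumption; the list above joins the terms with '+'.
    """
    return (
        standardise(main, train_main)
        - standardise(sa, train_sa)
        - standardise(ring_penalty, train_ring)
    )
```

For the QED-only objective in c), the score would reduce to the main-property term alone.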

The Initialisation directory contains the scripts Pos_Gen.py and Neg_Gen.py, which generate training data for the binary classification neural network. These scripts interface with the make_training_data.py script to create the data.

Citing Constrained Bayesian Optimisation for Automatic Chemical Design

A sample BibTeX entry is given below:

@article{griffiths2020constrained,
  title={Constrained Bayesian optimization for automatic chemical design using variational autoencoders},
  author={Griffiths, Ryan-Rhys and Hern{\'a}ndez-Lobato, Jos{\'e} Miguel},
  journal={Chemical Science},
  year={2020},
  publisher={Royal Society of Chemistry}
}
