darkreactions/serendipity_recommender

Serendipity Recommender

Code and data related to serendipity recommender paper

Training data

Located under data/training

Historical training data

Located under data/training/historical. This folder contains two files:

  1. raw_data.csv - Raw training data containing all descriptors generated by Escalate
  2. training_data.csv - Historical training data set used by all models

Amine-specific training data

Located under data/training/initialization. There are four folders, one for each of the four amines used to refine the models in the lab:

  • HJFYRMFYQMIZDG-UHFFFAOYSA-N - Hydroxyphenethyl amine
  • JMXLWMIFDJCGBV-UHFFFAOYSA-N - Dimethylammonium Iodide
  • NJQKYAASWUGXIT-UHFFFAOYSA-N - 4-Chlorophenylammonium Iodide
  • ZKRCWINLLKOVCL-UHFFFAOYSA-N - 4-Chlorophenethylammonium Iodide

Each amine folder contains two training draws, named training_draw0.csv and training_draw1.csv.
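
The folder layout above can be navigated programmatically. The following is a minimal sketch, assuming paths are resolved relative to the repository root. The names `AMINES`, `INIT_DIR` and `training_draw_paths` are hypothetical helpers introduced for illustration; they are not part of the repository's code.

```python
from pathlib import Path

# InChIKey -> amine name, copied from the list in this README.
AMINES = {
    "HJFYRMFYQMIZDG-UHFFFAOYSA-N": "Hydroxyphenethyl amine",
    "JMXLWMIFDJCGBV-UHFFFAOYSA-N": "Dimethylammonium Iodide",
    "NJQKYAASWUGXIT-UHFFFAOYSA-N": "4-Chlorophenylammonium Iodide",
    "ZKRCWINLLKOVCL-UHFFFAOYSA-N": "4-Chlorophenethylammonium Iodide",
}

# Assumed location of the amine folders, relative to the repository root.
INIT_DIR = Path("data/training/initialization")


def training_draw_paths(inchikey, root=Path(".")):
    """Return the paths of the two training draws for one amine."""
    if inchikey not in AMINES:
        raise KeyError(f"Unknown amine InChIKey: {inchikey}")
    folder = Path(root) / INIT_DIR / inchikey
    return [folder / f"training_draw{i}.csv" for i in range(2)]
```

Calling `training_draw_paths("JMXLWMIFDJCGBV-UHFFFAOYSA-N")` returns the two CSV paths for Dimethylammonium Iodide. The function only builds paths, so it is worth checking `path.exists()` before reading.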

Stateset

Located under data/stateset. As with the amine-specific initialization data, there are four folders, one per amine. Each amine folder contains three files:

  1. stateset.csv - Stateset of all possible concentrations along with their descriptors. This stateset is used during the active learning and final prediction phases
  2. stateset_volumes.csv - Reagent volumes to combine to get the concentrations defined in stateset.csv, used in the lab
  3. vertices.csv - Inorganic, organic and acid concentrations that represent the vertices of the explored stateset. Used to plot the stateset
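
As an illustration of how the three stateset files might be read together, here is a hedged sketch that uses only the Python standard library. `STATESET_FILES` and `load_stateset` are hypothetical names, not part of the repository. The column names are not documented in this README, so the code makes no assumption about them: each row is returned as a dictionary keyed by whatever header the file contains.

```python
import csv
from pathlib import Path

# The three files this README lists for each amine folder.
STATESET_FILES = ("stateset.csv", "stateset_volumes.csv", "vertices.csv")


def load_stateset(inchikey, root=Path("data/stateset")):
    """Load the three stateset files for one amine as lists of row dicts."""
    folder = Path(root) / inchikey
    tables = {}
    for filename in STATESET_FILES:
        with open(folder / filename, newline="") as handle:
            # DictReader takes the column names from the file's header row.
            tables[filename] = list(csv.DictReader(handle))
    return tables
```

For example, `load_stateset("JMXLWMIFDJCGBV-UHFFFAOYSA-N")["stateset.csv"]` would give one dictionary per candidate concentration. All values are read as strings, so numeric columns need converting before use.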

Results

Located under data/results/final_plate_observations. The results folder contains the observations made in the final prediction plate by all models. There are two subfolders, corresponding to the exploitation and serendipity recommenders.

Source Code

Located under /src

Model code

Located under src/models. All open source models are provided in this repo. Classification models such as BART, DT, KNN and PLATIPUS are under src/models/classification, and regression models such as BGP are under src/models/regression.

Plotting code

Code to generate the plots used in the paper is placed under src/plot.

Serendipity recommender

Code to calculate serendipity is placed under src/recsys. The recommender code is in preprocess.py and the objective function is in objectives.py.

Jupyter notebook

Located under /notebooks. Contains the code used to generate and present the results reported in the paper.
