CANDIY-spectrum

Human analyis of chemical spectra such as Mass Spectra (MS), Infra-Red Specta (FTIR), and Nuclear Magnetic Resonance is both time consuming and potentially inaccurate. This project aims to develop a set of methodologies incorporating these spectra for the prediction of chemical functional groups and structures.

This project is a stub, but we hope that it will spur development of machine learning methods for the analysis of chemical spectra.

Required Packages

Numpy==1.18.5
Rdkit==2020.03.1
Pandas==1.0.5
Jcamp
Tensorflow==1.15.0
Matplotlib==3.1.1
Sklearn==0.22.1

Scraping

Manual Scraping

IR and MS spectra were downloaded from NIST website. https://webbook.nist.gov/chemistry/. Scraping can be done through replacing the correct CAS number in the placeholder. https://webbook.nist.gov/cgi/cbook.cgi?ID="insert_cas"&Units=SI and downloading the required spectra.

(Or)

Automatic Scraping

Download all the species name available in NIST from this link https://webbook.nist.gov/chemistry/download/. Change path of cas_list to where the species name file is stored.

python scrap.py --save_dir='./data/' --cas_list='species.txt' --scrap_IR=true --scrap_MS=true --scrap_InChi=true

Prepare dataset

Parse all jdx files of IR and Mass spectra to standardize and store in a csv format. Also, parse inchi.txt to create target csv indicating presence of functional groups

python prepare_load_dataset.py --data_dir='./data/' --cas_list='species.txt'

Train the model

Run train.py to train the model. The model directory should contain params.json file listing all hyperparameters used for building the model. Optionally weights can be restored from pretrained model.

python train.py --model_dir=./experiments/mlp_model --data_dir=./data/

Name		Name	Last commit message	Last commit date
Latest commit History 74 Commits
experiments		experiments
model		model
.gitignore		.gitignore
README.md		README.md
prepare_load_dataset.py		prepare_load_dataset.py
scrap.py		scrap.py
synthesize_results.py		synthesize_results.py
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

experiments

experiments

model

model

.gitignore

.gitignore

README.md

README.md

prepare_load_dataset.py

prepare_load_dataset.py

scrap.py

scrap.py

synthesize_results.py

synthesize_results.py

train.py

train.py

Repository files navigation

CANDIY-spectrum

Required Packages

Scraping

Manual Scraping

Automatic Scraping

Prepare dataset

Train the model

About

Releases

Packages

Contributors 2

Languages

chopralab/candiy_spectrum

Folders and files

Latest commit

History

Repository files navigation

CANDIY-spectrum

Required Packages

Scraping

Manual Scraping

Automatic Scraping

Prepare dataset

Train the model

About

Resources

Stars

Watchers

Forks

Languages