Computer_aided_drug_discovery_kit

This pipeline supports virtual screening of pharmaceutical compounds using similarity-based analysis together with ligand-based and structure-based techniques. It consists of a collection of modules, each performing a different analysis.

ML_1: Uses the ChEMBL database to extract data (IC50, canonical SMILES) for compounds associated with a target of interest. The resulting data sets can be used for many cheminformatics tasks, e.g. similarity search, clustering, or machine learning.

In this notebook you retrieve the compounds that were tested against a specific target and filter the available bioactivity data. Then, for every compound, Lipinski's descriptors are calculated, together with PaDEL descriptors.
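
As an illustration of how Lipinski's descriptors are typically used, the sketch below counts rule-of-five violations from descriptor values that are assumed to be already computed. The `LipinskiDescriptors` class, the function names, and the example values are hypothetical and are not taken from the notebook. The "at most one violation" cutoff is a common convention, not necessarily the one used here.

```python
from dataclasses import dataclass


@dataclass
class LipinskiDescriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    molecular_weight: float  # in daltons
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def count_ro5_violations(d: LipinskiDescriptors) -> int:
    """Count how many of Lipinski's four criteria the compound violates."""
    checks = [
        d.molecular_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_ro5(d: LipinskiDescriptors, max_violations: int = 1) -> bool:
    """Assumed convention: a compound passes with at most one violation."""
    return count_ro5_violations(d) <= max_violations


# Made-up descriptor values, for illustration only.
small_compound = LipinskiDescriptors(180.2, 1.2, 1, 4)
bulky_compound = LipinskiDescriptors(812.0, 6.3, 7, 12)

print(passes_ro5(small_compound), passes_ro5(bulky_compound))
```

In practice the descriptor values would come from a cheminformatics toolkit rather than being typed in by hand.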

ML_2:

ML_3: Unwanted substructures filtering and substructure statistics.

ML_4: Similarity-based virtual screening.
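
A minimal sketch of similarity-based screening, assuming fingerprints are represented as sets of "on" bit indices and that Tanimoto similarity is the metric. The README does not state which metric ML_4 uses, so both choices are assumptions; the compound names and fingerprints are made up.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints: define similarity as 0
    return len(a & b) / union


def rank_by_similarity(query_fp, library, threshold=0.0):
    """Return (name, score) pairs with score >= threshold, best first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy fingerprints, for illustration only.
query_fp = {1, 4, 7, 9}
library = {
    "cmpd_A": {1, 4, 7, 9},
    "cmpd_B": {1, 4, 8},
    "cmpd_C": {2, 3},
}

ranking = rank_by_similarity(query_fp, library, threshold=0.3)
print(ranking)
```

Compounds scoring below the threshold are dropped, so only the closest analogues of the query are carried forward.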

ML_5:

ML_6:

ML_7: Machine learning model generation module using PaDEL descriptors as features. Data extracted from ChEMBL are used to train a RandomForest classifier that assigns new, unseen compounds to a bioactivity class (pharmaceutical screening). Data are first split into 3 classes of bioactivity level (low, medium, high). The RandomForest model is then validated on a held-out test set (20:80 test/train split), and metrics are evaluated using cross-validation methods. The module lets you change the ML parameters to obtain a good predictor for your target of interest. Metrics and statistics (e.g. the confusion matrix) can also be obtained.
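
The workflow described for ML_7 can be sketched with scikit-learn roughly as follows. The data here are synthetic random placeholders standing in for PaDEL descriptors and ChEMBL-derived class labels, and the hyperparameters (`n_estimators=100`, 5-fold cross-validation) are assumptions rather than the notebook's actual settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for a descriptor matrix: 150 compounds x 20 binary features.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(150, 20)).astype(float)
# Synthetic stand-in for the three bioactivity classes, 50 compounds each.
y = np.repeat(["low", "medium", "high"], 50)

# 20:80 test/train split, stratified so each class keeps its proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# Cross-validated accuracy on the training portion (5 folds assumed).
cv_scores = cross_val_score(clf, X_train, y_train, cv=5)

# Fit on the full training portion and evaluate on the held-out test set.
clf.fit(X_train, y_train)
labels = ["low", "medium", "high"]
cm = confusion_matrix(y_test, clf.predict(X_test), labels=labels)

print("mean CV accuracy:", cv_scores.mean())
print(cm)
```

Because the features are random, the scores themselves are meaningless; the sketch only shows the shape of the split, cross-validation, and confusion-matrix steps.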

ML_7.1: Prediction module. The joblib object generated in the previous module (ML_7) is used to predict the class of every new compound.
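
A minimal sketch of the save-and-reload step, assuming the model is a scikit-learn classifier persisted with `joblib.dump`. The file name `rf_model.joblib` and the synthetic data are hypothetical; in the real notebooks the model would come from ML_7 and the features from PaDEL descriptors of the new compounds.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Train a small stand-in model on synthetic data (placeholder for ML_7 output).
rng = np.random.default_rng(1)
X_train = rng.integers(0, 2, size=(60, 10)).astype(float)
y_train = np.repeat(["low", "medium", "high"], 20)
model = RandomForestClassifier(n_estimators=20, random_state=0)
model.fit(X_train, y_train)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name; the notebook may use a different one.
    model_path = os.path.join(tmp_dir, "rf_model.joblib")
    joblib.dump(model, model_path)     # what the training module would do
    loaded = joblib.load(model_path)   # what the prediction module would do

# Synthetic descriptors for five new, unseen compounds.
X_new = rng.integers(0, 2, size=(5, 10)).astype(float)
predicted = loaded.predict(X_new)
print(predicted)
```

The new compounds must be described with the same features, in the same column order, as the data the model was trained on.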

francescopatane96/Computer_aided_drug_discovery_kit

Folders and files

Latest commit

History

Repository files navigation

Computer_aided_drug_discovery_kit

About

Topics

Resources

License

Stars

Watchers

Forks

Languages