hsiaoyi0504/awesome-cheminformatics

Awesome Cheminformatics

Cheminformatics (also known as chemoinformatics, chemioinformatics and chemical informatics) is the use of computer and informational techniques applied to a range of problems in the field of chemistry. (Wikipedia)

A curated list of awesome Cheminformatics software, resources, and libraries. Mostly command-line based, and free or open-source. Please feel free to contribute!

Contents

Applications

Visualization

  • PyMOL - Python-enhanced molecular graphics tool.
  • Jmol - Browser-based HTML5 viewer and stand-alone Java viewer for chemical structures in 3D.
  • VMD - Molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting.
  • Chimera - Highly extensible program for interactive molecular visualization and analysis. Source is available.
  • ChimeraX - The next-generation molecular visualization program, following UCSF Chimera. Source is available.
  • DataWarrior - A program for data visualization and analysis which combines dynamic graphical views and interactive row filtering with chemical intelligence.

Command Line Tools

  • Open Babel - Chemical toolbox designed to speak the many languages of chemical data.
  • MayaChemTools - Collection of Perl and Python scripts, modules, and classes that support day-to-day computational discovery needs.
  • Packmol - Initial configurations for molecular dynamics simulations by packing optimization.
  • BCL::Commons

Docking

  • AutoDock Vina - Molecular docking and virtual screening.
  • smina - Customized AutoDock Vina to better support scoring function development and high-performance energy minimization.

Virtual Machine

  • myChEMBL - A version of ChEMBL built using Open Source software (Ubuntu, PostgreSQL, RDKit)
  • 3D e-Chem Virtual Machine - Virtual machine with all software and sample data to run 3D-e-Chem KNIME workflows.

Libraries

General Purpose

  • RDKit - Collection of cheminformatics and machine-learning software written in C++ and Python.
  • Indigo - Universal molecular toolkit for molecular fingerprinting, substructure search, and molecular visualization. Written in C++, with Java, C#, and Python wrappers.
  • CDK (Chemistry Development Kit) - Algorithms for structural chemo- and bioinformatics, implemented in Java.
  • ChemmineR - Cheminformatics package for analyzing drug-like small molecule data in R.
  • ChemPy - A Python package useful for chemistry (mainly physical/inorganic/analytical chemistry)
  • MolecularGraph.jl - A graph-based molecule modeling and chemoinformatics analysis toolkit fully implemented in Julia
  • datamol - Molecular manipulation made easy. A light wrapper built on top of RDKit.
  • CGRtools - Toolkit for processing molecules, reactions and condensed graphs of reactions. Can be used for chemical standardization, MCS search, and tautomer generation, with backward compatibility to RDKit and NetworkX.

Format Checking

Visualization

  • Kekule.js - Front-end JavaScript library for providing the ability to represent, draw, edit, compare and search molecule structures on web browsers.
  • 3Dmol.js - An object-oriented, webGL based JavaScript library for online molecular visualization.
  • JChemPaint - Chemical 2D structure editor application/applet based on the Chemistry Development Kit.
  • rdeditor - Simple RDKit molecule editor GUI using PySide.
  • nglviewer - Interactive molecular graphics for Jupyter notebooks.
  • RDKit.js - Official JavaScript distribution of cheminformatics functionality from the RDKit - a C++ library for cheminformatics.

Molecular Descriptors

  • mordred - Molecular descriptor calculator based on RDKit.
  • DescriptaStorus - Descriptor computation (chemistry) and (optional) storage for machine learning.
  • mol2vec - Vector representations of molecular substructures.
  • Align-it - Align molecules according to their pharmacophores.
  • Rcpi - R/Bioconductor package to generate various descriptors of proteins, compounds and their interactions.

Machine Learning

  • DeepChem - Deep learning library for chemistry based on TensorFlow.
  • Chemprop - Directed message passing neural networks for property prediction of molecules and reactions with uncertainty and interpretation.
  • ChemML - Machine learning and informatics program suite for the analysis, mining, and modeling of chemical and materials data (based on TensorFlow).
  • olorenchemengine - Molecular property prediction with unified API for diverse models and representations, with integrated uncertainty quantification, interpretability, and hyperparameter/architecture tuning.
  • OpenChem - Deep learning toolkit for computational chemistry with PyTorch backend.
  • DGL-LifeSci - DGL-based package for various applications in life science with graph neural networks.
  • chainer-chemistry - A Library for Deep Learning in Biology and Chemistry.
  • pytorch-geometric - A PyTorch library that provides implementations of many graph convolution algorithms.
  • chemmodlab - A Cheminformatics Modeling Laboratory for Fitting and Assessing Machine Learning Models in R.
  • Summit - A Python package for optimizing chemical reactions using machine learning (contains 10 algorithms + several benchmarks).

Web APIs

Databases

Docking

  • Rosetta - A comprehensive software suite for modeling macromolecular structures. Used largely for protein-protein docking.
  • DOCKSTRING - Automates and standardizes ligand preparation for AutoDock Vina.

Molecular Dynamics

  • Gromacs - Molecular dynamics package mainly designed for simulations of proteins, lipids and nucleic acids.
  • OpenMM - High performance toolkit for molecular simulation including extensive language bindings for Python, C, C++, and even Fortran.
  • NAMD - A parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.
  • MDTraj - Analysis of molecular dynamics trajectories.
  • cclib - Parsers and algorithms for computational chemistry logfiles.
  • ProDy - A Python package for protein dynamics analysis

Others

  • eiR - Accelerated similarity searching of small molecules
  • OPSIN - Open Parser for Systematic IUPAC nomenclature
  • Cookiecutter for Computational Molecular Sciences - Python-centric Cookiecutter for Molecular Computational Chemistry Packages by MolSSI
  • Auto-QChem - An automated workflow for the generation and storage of DFT calculations for organic molecules.
  • Gypsum-DL - A program for converting 2D SMILES strings to 3D models.
  • RDchiral - Wrapper for RDKit's RunReactants to improve stereochemistry handling
  • confgen - Webapp for generating conformers

Journals

Resources

Courses

Blogs

Books

See Also

License

CC0