QDeep

Distance-based protein model quality estimation using deep ResNets

Getting Started

QDeep can be downloaded by typing

$ git clone https://github.com/Bhattacharya-Lab/QDeep.git

Prerequisites

  1. Linux system: QDeep has been tested on x86_64 Linux systems. Windows and macOS are currently not supported.
  2. python 3.6 or newer
  3. tensorflow 1.13.1 or newer
  4. keras 2.3.1 or newer
  5. numpy 1.14.5 or newer
  6. Pyrosetta Python-3.6.Release: pyrosetta-2019.45+release or newer (http://www.pyrosetta.org/dow)

Installation

  1. If you don't have Python 3.6 or newer, you can download Python 3.6 from https://www.python.org/downloads/release/python-360/ and install it.
  2. If you don't have "tensorflow" package, install it by typing $ pip install tensorflow
  3. If you don't have "keras" package, install it by typing $ pip install keras
  4. If you don't have "numpy" package, install it by typing $ pip install numpy
  5. If you don't have PyRosetta installed, please download it from http://www.pyrosetta.org/dow and install it.
  6. Go to the directory where you downloaded QDeep and configure by typing:
$ cd QDeep
$ python configure.py
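
Before running configure.py, it can help to confirm that the installed packages meet the minimum versions listed under Prerequisites. The following is a minimal sketch and not part of QDeep: the helper names (version_tuple, meets_minimum, check_prerequisites) are hypothetical, and it assumes each package exposes a `__version__` attribute, which tensorflow, keras, and numpy normally do. PyRosetta is not checked here.

```python
import importlib

# Minimum versions copied from the Prerequisites list above.
MINIMUMS = {"tensorflow": "1.13.1", "keras": "2.3.1", "numpy": "1.14.5"}


def version_tuple(text):
    """Turn '2.3.1' or '1.14.5rc1' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first component with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_prerequisites(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, ok)} for each package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = (None, False)  # package is not installed
            continue
        installed = getattr(module, "__version__", "0")
        report[name] = (installed, meets_minimum(installed, minimum))
    return report
```

Calling check_prerequisites() returns a dictionary with one entry per package; any entry whose second element is False points to a package that should be installed or upgraded with pip as described above.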

Usage

To run QDeep, type

$ python QDeep.py

You will see the following output

***************************************************************************
*                               QDeep                                     *
*   Distance-based protein model quality estimation using deep ResNets    *
*          For comments, please email to bhattacharyad@auburn.edu         *
***************************************************************************
usage: QDeep.py [-h] [--tgt TARGET_NAME] [--seq SEQ_FILE] [--dcy DECOY_DIR]
                [--aln ALN_FILE] [--dist DISTANCE_FILE] [--pssm PSSM_FILE]
                [--spd3 SPD33_FILE] [--msa YES] [--gpu DEVICE_ID]
                [--out OUTPUT_PATH]

Arguments:
  -h, --help            show this help message and exit
  --tgt TARGET_NAME     Target name
  --seq SEQ_FILE        Sequence file
  --dcy DECOY_DIR       Decoy directory
  --aln ALN_FILE        Multiple sequence alignment
  --dist DISTANCE_FILE  DMPfold predicted distance
  --pssm PSSM_FILE      PSSM file
  --spd3 SPD33_FILE     SPIDER3 output (.spd33)
  --msa YES             yes|no Whether to use deep MSA (default: no)
  --gpu DEVICE_ID       Device id (0/1/2/3/4/..) Whether to run on GPU
                        (default: CPU)
  --out OUTPUT_PATH     Output directory name

Example commands to run QDeep

QDeep can be run with either a standard MSA or a deep MSA.

  • To run QDeep with standard MSA, type
$ cd QDeep
$ python QDeep.py --tgt T0865 --seq example/QDeep_standard/T0865.fasta --dcy example/QDeep_standard/T0865 --aln example/QDeep_standard/T0865.aln --dist example/QDeep_standard/rawdistpred.current --pssm example/QDeep_standard/T0865.pssm --spd3 example/QDeep_standard/T0865.spd33 --out T0865

To verify the run, compare your output with the log provided for the above command. The output file for the above example can be found here.

  • To run QDeep with deep MSA, type
$ cd QDeep
$ python QDeep.py --tgt T0865 --seq example/QDeep_deep/T0865.fasta --dcy example/QDeep_deep/T0865 --aln example/QDeep_deep/T0865.aln --dist example/QDeep_deep/rawdistpred.current --pssm example/QDeep_deep/T0865.pssm --spd3 example/QDeep_deep/T0865.spd33 --msa yes --out T0865
  • A GPU is not required to run QDeep, although it may speed up the prediction. To run QDeep on a GPU, type
$ cd QDeep
$ python QDeep.py --tgt T0865 --seq example/QDeep_standard/T0865.fasta --dcy example/QDeep_standard/T0865 --aln example/QDeep_standard/T0865.aln --dist example/QDeep_standard/rawdistpred.current --pssm example/QDeep_standard/T0865.pssm --spd3 example/QDeep_standard/T0865.spd33 --gpu 0 --out T0865
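
The example commands above differ only in the input directory and in the optional --msa and --gpu flags, so they can be assembled programmatically when scoring many targets. The sketch below is an illustration and not part of QDeep: build_qdeep_command and run_qdeep are hypothetical helpers, and the file layout (one directory holding `<target>.fasta`, `<target>.aln`, `<target>.pssm`, `<target>.spd33`, `rawdistpred.current`, and a decoy directory named after the target) is assumed from the bundled T0865 example.

```python
import subprocess
import sys
from pathlib import Path


def build_qdeep_command(target, input_dir, out_dir, deep_msa=False, gpu=None):
    """Assemble the argument list for one QDeep.py run.

    Assumes the layout of the bundled example: all inputs for a target sit
    in input_dir and are named after the target.
    """
    base = Path(input_dir)
    cmd = [
        sys.executable, "QDeep.py",
        "--tgt", target,
        "--seq", str(base / f"{target}.fasta"),
        "--dcy", str(base / target),
        "--aln", str(base / f"{target}.aln"),
        "--dist", str(base / "rawdistpred.current"),
        "--pssm", str(base / f"{target}.pssm"),
        "--spd3", str(base / f"{target}.spd33"),
    ]
    if deep_msa:
        cmd += ["--msa", "yes"]      # inputs were generated from a deep MSA
    if gpu is not None:
        cmd += ["--gpu", str(gpu)]   # device ID; omit to run on CPU
    cmd += ["--out", str(out_dir)]
    return cmd


def run_qdeep(cmd, qdeep_dir="QDeep"):
    """Run the command from inside the QDeep checkout (the `cd QDeep` step)."""
    return subprocess.run(cmd, cwd=qdeep_dir, check=True)
```

For instance, `run_qdeep(build_qdeep_command("T0865", "example/QDeep_standard", "T0865", gpu=0))` corresponds to the GPU example above. Passing the arguments as a list avoids shell quoting problems when paths contain spaces.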

A detailed explanation of each option is provided below.

  • --tgt Target name: This should be the name of the target, without any file extension.

  • --seq Sequence file: This should contain the sequence, with or without the header. The sequence may also span multiple lines in the file.

  • --dcy Decoy directory: This should be a directory containing all the PDB models, each with the .pdb extension.

  • --aln Multiple Sequence Alignment file: The alignment file should be generated by running three iterations of HHblits against uniclust30_2018_08, with a query sequence coverage of 10%, a pairwise sequence identity of 90%, and an E-value inclusion threshold of 10^-3. You can download HHblits from https://github.com/soedinglab/hh-suite.

  • --dist DMPfold predicted distance: To predict distance using DMPfold, you can download DMPfold from https://github.com/psipred/DMPfold

  • --pssm PSSM file: You can generate the sequence profile by searching the NR database using PSI-BLAST. You can download the PSI-BLAST from ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/

  • --spd3 SPIDER3 output: The secondary structure and the solvent accessibility should be predicted using SPIDER3. SPIDER3 can be downloaded from https://sparks-lab.org/downloads/

  • --msa yes|no: This is optional. Use this flag if you want to use an MSA generated by DeepMSA. You can download DeepMSA from https://zhanglab.ccmb.med.umich.edu/DeepMSA/. When you use a DeepMSA-generated MSA, please make sure to also:

    • generate the sequence profile (PSSM) using the deep MSA
    • run SPIDER3 using the deep MSA to predict secondary structure and solvent accessibility
    • predict distances with DMPfold using the deep MSA
  • --gpu Device ID: If you want to use GPU for the prediction, please use this flag and specify the device ID.

  • --out Output path: Please select a location for the output to be stored. It is recommended that you specify a directory name for the output.
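
The input requirements described above for --seq and --dcy can be checked before launching a run. The following is a minimal sketch and not QDeep's own parsing code: read_sequence and list_decoys are hypothetical helpers that only illustrate the stated rules, namely a sequence file with an optional header and a sequence that may span several lines, and a decoy directory whose models carry the .pdb extension.

```python
from pathlib import Path


def read_sequence(path):
    """Read one sequence, skipping an optional '>' header line and
    joining a sequence that is wrapped over multiple lines."""
    residues = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # ignore blank lines and the header, if present
        residues.append(line)
    return "".join(residues)


def list_decoys(decoy_dir):
    """Return the sorted .pdb files found directly inside decoy_dir."""
    return sorted(
        p for p in Path(decoy_dir).iterdir()
        if p.is_file() and p.suffix == ".pdb"
    )
```

An empty string from read_sequence or an empty list from list_decoys indicates that the corresponding input does not match the format described above.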

Data

  1. Download input data for running QDeep with standard and deep alignments on both CASP12 and CASP13 datasets.
  2. Download QDeep predictions for CASP12 and CASP13 targets using standard alignments.
  3. Download QDeep predictions for CASP12 and CASP13 targets using deep alignments.
    Please refer to the readme.txt for more information about the file formats.

Cite

If you find QDeep useful, please cite our ISMB 2020 Proceedings paper published in Bioinformatics.