ProDAR enhances protein function prediction and extracts Dynamically Activated Residues (DARs) using the dynamical information obtained from normal mode analysis (NMA). The code accompanies the paper "Encoding protein dynamic information in graph representation for functional residue identification" (Cell Reports Physical Science, 2022).
├── data
│   ├── data-graphs.ipynb
│   ├── data-graphs.py
│   ├── data-sifts.ipynb
│   ├── data-sifts.py
│   ├── graphs-10A
│   ├── nma-anm
│   ├── pdbs
│   ├── pis
│   └── sifts
│       ├── mf_go_codes-allcnt.dat
│       ├── mf_go_codes-thres-50.dat
│       ├── mf_go_codes-thres-50.npy
│       ├── pdb_chains.dat
│       ├── pdbmfgos-thres-50.json
│       ├── sifts-err-1.log
│       └── sifts-err-2.log
├── datasets
│   └── dataset.py
├── evaluation_kfold.py
├── experiment_kfold.py
├── models
│   └── multilabel_classifiers
│       ├── GAT.py
│       ├── GCN.py
│       └── GraphSAGE.py
├── prodar-env.yml
└── prodar.py
- Clone the environment from prodar-env.yml using miniconda:
conda env create -f prodar-env.yml
- Install PyG package via pip wheel:
pip install torch-scatter -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html
pip install torch-geometric
where ${TORCH} and ${CUDA} should be replaced by the PyTorch and CUDA versions (TORCH=1.10.0 and CUDA=cu113 for this specific environment).
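The substitution described above can be sketched in Python. The helper below, `pyg_wheel_index`, is a hypothetical name and not part of this repository; it only formats the wheel index URL pattern used in the install commands. In an installed environment its two inputs would typically come from `torch.__version__` and `torch.version.cuda`, and passing `None` for the CUDA version is assumed to select the `cpu` wheels.

```python
from typing import Optional


def pyg_wheel_index(torch_version: str, cuda_version: Optional[str]) -> str:
    """Return the PyG wheel index URL for a given PyTorch/CUDA pair.

    Hypothetical helper for illustration; it only fills in the
    ${TORCH} and ${CUDA} placeholders of the URL shown above.
    """
    # torch.__version__ may carry a local suffix such as "1.10.0+cu113".
    torch_tag = torch_version.split("+")[0]
    # "11.3" -> "cu113"; no CUDA build -> "cpu" (assumed tag).
    cuda_tag = "cpu" if cuda_version is None else "cu" + cuda_version.replace(".", "")
    return f"https://data.pyg.org/whl/torch-{torch_tag}+{cuda_tag}.html"


if __name__ == "__main__":
    # Values stated for this environment: TORCH=1.10.0, CUDA=cu113.
    print(pyg_wheel_index("1.10.0", "11.3"))
```

The printed URL can then be passed to `pip install torch-scatter -f <url>` and `pip install torch-sparse -f <url>`.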
- Extra packages (if not installed by previous steps) may be installed via pip wheel.
To preprocess the data and generate protein graphs, run the first script to download raw data from the RCSB PDB search API and the PDBe SIFTS API, then run the second script to export the filtered PDB and GO entries as JSON graphs.
- Execute data-sifts.py:
python data-sifts.py
- Execute data-graphs.py:
python data-graphs.py
For the above two steps, the corresponding *.ipynb notebooks are provided with markdown notes and optional visualization for use in Jupyter Lab/Notebook.
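As a rough illustration of what a distance-cutoff protein graph looks like, the sketch below connects residues whose representative coordinates lie within 10 Å of each other and serializes the result as JSON. The 10 Å cutoff is inferred from the `graphs-10A` directory name. The function name `contact_graph`, the toy coordinates, and the JSON layout are assumptions made for illustration; they are not the format written by data-graphs.py.

```python
import json
import math
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def contact_graph(coords: Sequence[Coord], cutoff: float = 10.0) -> Dict[str, object]:
    """Build an undirected residue contact graph from 3D coordinates.

    Hypothetical helper: two residues are linked when the Euclidean
    distance between their coordinates is at most `cutoff` (in Å).
    """
    edges: List[List[int]] = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                edges.append([i, j])
    return {"num_nodes": len(coords), "edges": edges}


if __name__ == "__main__":
    # Toy coordinates (made up): three residues spaced 3.8 Å apart
    # along a line, plus one distant residue.
    toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
    print(json.dumps(contact_graph(toy)))
```

With the toy input, the first three residues form a fully connected triangle and the distant fourth residue is left isolated.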
python experiment_kfold.py <options>
python evaluation_kfold.py
If you use the scripts, analyses, models, results, or any snippet of this work and find it useful, please cite the associated paper:
@article{chiang2022encoding,
title={Encoding protein dynamic information in graph representation for functional residue identification},
author={Chiang, Yuan and Hui, Wei-Han and Chang, Shu-Wei},
journal={Cell Reports Physical Science},
volume={3},
number={7},
pages={100975},
year={2022},
publisher={Elsevier}
}
TBD