ProDAR

ProDAR enhances protein function prediction and extracts Dynamically Activated Residues (DARs) using dynamical information obtained from normal mode analysis (NMA). The code accompanies the paper "Encoding protein dynamic information in graph representation for functional residue identification".

[arXiv] [CRPS]
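
As background for the NMA step, here is a minimal sketch of how an anisotropic network model (ANM) Hessian can be assembled from residue coordinates; the eigenvectors of this matrix are the normal modes. It assumes the ANM flavour of NMA suggested by the `nma-anm` directory name. The function name `anm_hessian`, the 15 Å cutoff, the unit spring constant, and the toy coordinates are illustrative assumptions, not values taken from this repository.

```python
def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Return the 3N x 3N ANM Hessian as nested lists (assumes distinct coordinates)."""
    n = len(coords)
    hessian = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [coords[j][k] - coords[i][k] for k in range(3)]
            dist2 = sum(x * x for x in d)
            if dist2 > cutoff ** 2:
                continue  # residues beyond the cutoff are not connected by a spring
            for a in range(3):
                for b in range(3):
                    value = -gamma * d[a] * d[b] / dist2
                    # Off-diagonal 3x3 blocks couple residues i and j.
                    hessian[3 * i + a][3 * j + b] += value
                    hessian[3 * j + a][3 * i + b] += value
                    # Diagonal blocks are minus the sum of the off-diagonal blocks.
                    hessian[3 * i + a][3 * i + b] -= value
                    hessian[3 * j + a][3 * j + b] -= value
    return hessian


# Hypothetical toy coordinates (in Å), standing in for one point per residue.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 3.8)]
hessian = anm_hessian(toy_coords)
```

Every row of the resulting matrix sums to zero, which reflects the fact that a rigid translation of the whole structure costs no energy.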

Hierarchy

├── data
│   ├── data-graphs.ipynb
│   ├── data-graphs.py
│   ├── data-sifts.ipynb
│   ├── data-sifts.py
│   ├── graphs-10A
│   ├── nma-anm
│   ├── pdbs
│   ├── pis
│   └── sifts
│       ├── mf_go_codes-allcnt.dat
│       ├── mf_go_codes-thres-50.dat
│       ├── mf_go_codes-thres-50.npy
│       ├── pdb_chains.dat
│       ├── pdbmfgos-thres-50.json
│       ├── sifts-err-1.log
│       └── sifts-err-2.log
├── datasets
│   └── dataset.py
├── evaluation_kfold.py
├── experiment_kfold.py
├── models
│   └── multilabel_classifiers
│       ├── GAT.py
│       ├── GCN.py
│       └── GraphSAGE.py
├── prodar-env.yml
└── prodar.py

Environment

Create the conda environment from prodar-env.yml using Miniconda:

conda env create -f prodar-env.yml

Install the PyG packages from pip wheels:

pip install torch-scatter -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html
pip install torch-geometric

where ${TORCH} and ${CUDA} should be replaced by the PyTorch and CUDA versions (TORCH=1.10.0 and CUDA=cu113 for this specific environment).
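
To make the substitution concrete, the short sketch below builds the wheel index URL from the two values. The helper name `pyg_wheel_index` is hypothetical and exists only for illustration; the URL pattern and the example versions are the ones given above.

```python
def pyg_wheel_index(torch_version, cuda_tag):
    """Fill the ${TORCH} and ${CUDA} placeholders of the PyG wheel index URL."""
    return f"https://data.pyg.org/whl/torch-{torch_version}+{cuda_tag}.html"


# Values for this specific environment, as stated above.
wheel_index = pyg_wheel_index("1.10.0", "cu113")
print(wheel_index)  # https://data.pyg.org/whl/torch-1.10.0+cu113.html
```

The printed URL is what should follow the `-f` flag in the two pip commands above.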

Any extra packages not installed by the previous steps can be installed with pip.

Data

To preprocess the data and generate protein graphs, run the two scripts in the data directory in order. The first downloads raw data from the RCSB PDB search API and the PDBe SIFTS API; the second exports the filtered PDB and GO entries as JSON graphs.

Execute data-sifts.py

python data-sifts.py

Execute data-graphs.py

python data-graphs.py

For both steps, the matching *.ipynb notebooks provide markdown notes and optional visualization when run in JupyterLab or Jupyter Notebook.
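
For orientation, the sketch below shows one way a protein structure can be turned into a JSON contact graph. It is not the code in data-graphs.py: the 10 Å cutoff is inferred from the `graphs-10A` directory name, and the function name `contact_graph`, the use of one coordinate per residue, the JSON keys, and the toy coordinates are all assumptions made for illustration.

```python
import json
import math


def contact_graph(coords, cutoff=10.0):
    """Connect every pair of residues whose distance is within the cutoff (in Å)."""
    edges = [
        [i, j]
        for i in range(len(coords))
        for j in range(i + 1, len(coords))
        if math.dist(coords[i], coords[j]) <= cutoff
    ]
    # Hypothetical schema; the graphs exported by the repository may differ.
    return {"num_nodes": len(coords), "edges": edges}


# Hypothetical coordinates: three residues close together and one far away.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
graph = contact_graph(toy_coords)
graph_json = json.dumps(graph)
```

In this toy example the first three residues are mutually connected, while the fourth lies beyond the cutoff of all others and remains isolated.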

Run

Experiment (currently only k-fold cross validation)

python experiment_kfold.py <options>
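
As a reminder of what k-fold cross validation does, the sketch below splits sample indices into k folds and holds out each fold once for validation. It is a generic illustration, not the splitting logic of experiment_kfold.py; the function name `kfold_indices`, the default k, and the seed are assumptions.

```python
import random


def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train, valid) index lists, holding out each of the k folds once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k distributes the shuffled indices across k nearly equal folds.
    folds = [indices[start::k] for start in range(k)]
    for held_out in range(k):
        valid = folds[held_out]
        train = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        yield train, valid


splits = list(kfold_indices(10, k=5))
```

Each sample appears in exactly one validation fold, so every protein graph contributes to the evaluation once.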

Evaluation (currently evaluates all saved models in `history/`)

python evaluation_kfold.py

Citing

If you use the scripts, analyses, models, results, or any snippet of this work and find it useful, please cite the associated paper:

@article{chiang2022encoding,
  title={Encoding protein dynamic information in graph representation for functional residue identification},
  author={Chiang, Yuan and Hui, Wei-Han and Chang, Shu-Wei},
  journal={Cell Reports Physical Science},
  volume={3},
  number={7},
  pages={100975},
  year={2022},
  publisher={Elsevier}
}

License

TBD
