kiharalab/NuFold

NuFold: 3D RNA Structure Prediction Method

NuFold is a deep learning method for predicting the 3D (tertiary) structure of RNA, built on a flexible nucleobase center representation. It is intended for biologists and bioinformatics researchers who study RNA structure and function.

License: GPL v3 for academic use. For commercial use, please contact us about separate licensing.

Contact: Daisuke Kihara (dkihara@purdue.edu)

Cite: Kagaya, Y., Zhang, Z., Ibtehaz, N., Wang, X., Nakamura, T., Huang, D., & Kihara, D. (2023). NuFold: A Novel Tertiary RNA Structure Prediction Method Using Deep Learning with Flexible Nucleobase Center Representation. bioRxiv.

Online Platform:

  1. Google Colab

Environment Setup and Installation

1. Conda Environment

Start by setting up a dedicated Conda environment:

conda create -n nufold_P python=3.10
conda activate nufold_P

2. PyTorch and Related Libraries

Install PyTorch and its companion libraries from the CUDA 11.7 (cu117) wheel index so that NuFold can run on a GPU:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
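
To confirm that the install above produced a CUDA-enabled build, a quick check like the following may help. This is a minimal sketch, not part of NuFold; the helper name describe_torch is hypothetical, and it only uses torch.__version__, torch.version.cuda, and torch.cuda.is_available().

```python
import importlib.util


def describe_torch():
    """Report whether PyTorch is installed and whether it can see a CUDA device.

    Hypothetical helper for this README; it is not shipped with NuFold.
    """
    info = {
        "installed": False,
        "version": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    # find_spec avoids an ImportError when PyTorch is absent from this environment.
    if importlib.util.find_spec("torch") is None:
        return info

    import torch

    info["installed"] = True
    info["version"] = torch.__version__
    # CUDA version the wheel was built against; None for a CPU-only build.
    info["cuda_build"] = torch.version.cuda
    # True only if a usable GPU and driver are visible at runtime.
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_torch().items():
        print(f"{key}: {value}")
```

If cuda_available is False on a machine with a GPU, the driver version and the wheel's CUDA build are the first things to compare.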

3. Additional Dependencies

Install the necessary Python packages for NuFold:

pip install ml-collections dm-tree deepspeed protobuf scipy biopython numpy
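
Several of these packages are imported under a different name than the one used with pip, which can make a failed install hard to spot. The sketch below checks that each one can be located from the active environment. The helper name missing_packages is hypothetical, and the mapping covers only the packages named in this README.

```python
import importlib.util

# pip package name -> module name used in an import statement.
REQUIRED = {
    "torch": "torch",
    "ml-collections": "ml_collections",
    "dm-tree": "tree",
    "deepspeed": "deepspeed",
    "protobuf": "google.protobuf",
    "scipy": "scipy",
    "biopython": "Bio",
    "numpy": "numpy",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose modules cannot be found (hypothetical helper)."""
    missing = []
    for pip_name, module_name in required.items():
        try:
            # For dotted names, find_spec imports the parent package first and
            # raises ModuleNotFoundError if that parent is absent.
            found = importlib.util.find_spec(module_name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(pip_name)
    return missing


if __name__ == "__main__":
    absent = missing_packages()
    if absent:
        print("Missing packages: " + ", ".join(absent))
    else:
        print("All listed packages were found.")
```

The check only locates the modules; it does not import them, so it will not catch a package that is present but broken.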

4. Aria2 for Downloading

For efficient downloading of large files, install Aria2 (the command below is for Debian or Ubuntu and may need sudo):

apt-get install aria2

rMSA Configuration

Clone rMSA and set up the database:

git clone https://github.com/pylelab/rMSA
cd rMSA/database/
aria2c -q -R -x 16 -j 20 -s 65536 -c --optimize-concurrent-downloads https://kiharalab.org/nufold/database.zip
unzip database.zip && rm database.zip
cd ../..

IPknot Setup

IPknot is used for RNA secondary structure prediction. Download and set it up with the following commands:

wget https://github.com/satoken/ipknot/releases/download/v1.1.0/ipknot-1.1.0-x86_64-linux.zip
unzip ipknot-1.1.0-x86_64-linux.zip && rm ipknot-1.1.0-x86_64-linux.zip
chmod +x ipknot-1.1.0-x86_64-linux/ipknot

Model Checkpoint

Download the NuFold model checkpoint into the checkpoints directory:

mkdir -p checkpoints
wget -O checkpoints/global_step145245.pt http://kiharalab.org/nufold/global_step145245.pt
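
Before running a prediction, it can be useful to confirm that the rMSA, IPknot, and checkpoint steps above left files where expected. The following is a minimal sketch that assumes all commands were run from the same working directory; check_assets is a hypothetical helper, and whether nufold.py looks for these exact relative paths is an assumption, not something this README states.

```python
import os
from pathlib import Path

# Relative paths created by the commands in this README (assumed to share one
# working directory). The rMSA entry only confirms the directory exists, because
# the contents of database.zip are not listed here.
EXPECTED = [
    ("rMSA database directory", "rMSA/database", "dir"),
    ("IPknot binary", "ipknot-1.1.0-x86_64-linux/ipknot", "exe"),
    ("NuFold checkpoint", "checkpoints/global_step145245.pt", "file"),
]


def check_assets(root="."):
    """Map each expected asset to True or False (hypothetical helper)."""
    root = Path(root)
    status = {}
    for label, relative_path, kind in EXPECTED:
        path = root / relative_path
        if kind == "dir":
            status[label] = path.is_dir()
        elif kind == "exe":
            # The chmod +x step should have made the binary executable.
            status[label] = path.is_file() and os.access(path, os.X_OK)
        else:
            status[label] = path.is_file()
    return status


if __name__ == "__main__":
    for label, ok in check_assets().items():
        print(f"{'ok     ' if ok else 'MISSING'} {label}")
```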

Running NuFold with the End-to-End Script

The end-to-end script nufold.py simplifies running NuFold. To predict an RNA structure with it, follow these steps:

  1. Make the nufold.py script executable:
chmod +x nufold.py
  2. Run NuFold by providing the RNA sequence as a command-line argument:
./nufold.py your_sequence

The script will automatically generate a random job name, create the necessary directories, perform data preprocessing, run NuFold, and save the output in the nufold_output.zip file.
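
The run described above can also be driven from Python, for example to validate the sequence first and inspect the archive afterwards. This is a sketch under stated assumptions: the accepted alphabet (A, C, G, U) and the location of nufold_output.zip in the current directory are not specified in this README, and normalize_sequence, run_nufold, and list_outputs are hypothetical names.

```python
import subprocess
import zipfile

# Assumption: only the four standard RNA bases are accepted. This README does
# not state which characters nufold.py allows.
RNA_ALPHABET = set("ACGU")


def normalize_sequence(raw):
    """Strip whitespace, upper-case, and reject characters outside A, C, G, U."""
    sequence = "".join(raw.split()).upper()
    if not sequence:
        raise ValueError("empty sequence")
    unexpected = sorted(set(sequence) - RNA_ALPHABET)
    if unexpected:
        raise ValueError("unexpected characters: " + "".join(unexpected))
    return sequence


def run_nufold(raw_sequence, script="./nufold.py"):
    """Call the end-to-end script with one sequence argument, as in step 2."""
    sequence = normalize_sequence(raw_sequence)
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run([script, sequence], check=True)


def list_outputs(zip_path="nufold_output.zip"):
    """Return the file names stored in the output archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


if __name__ == "__main__":
    run_nufold("GGGAAACUUCGGUUUCCC")  # illustrative sequence only
    print("\n".join(list_outputs()))
```

Passing the sequence as a list element rather than through a shell string avoids quoting problems with unusual input.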
