NuFold is a state-of-the-art deep learning method for predicting the 3D structures of RNA. It is particularly useful for biologists and bioinformatics researchers studying RNA structure and function.
License: GPL v3 for academic use. For commercial use, please contact us about alternative licensing.
Contact: Daisuke Kihara (dkihara@purdue.edu)
Online Platform:
Start by setting up a dedicated Conda environment:
conda create -n nufold_P python=3.10
conda activate nufold_P
Install PyTorch and its associated libraries, built with CUDA 11.7 support (the cu117 wheel index), for GPU-accelerated performance:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
Install the necessary Python packages for NuFold:
pip install ml-collections dm-tree deepspeed protobuf scipy biopython numpy
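Before moving on, it can help to confirm that the packages are visible from inside the nufold_P environment. The following is a minimal sketch, not part of NuFold itself: the `find_missing` helper and the `REQUIRED_MODULES` mapping are hypothetical names introduced here, and the module names are the usual import names for these pip packages. `shutil` is not listed because it ships with Python's standard library.

```python
import importlib.util

# Assumed mapping of pip package name -> import name for the packages
# installed above; torch comes from the earlier PyTorch step.
REQUIRED_MODULES = {
    "torch": "torch",
    "ml-collections": "ml_collections",
    "dm-tree": "tree",
    "deepspeed": "deepspeed",
    "protobuf": "google.protobuf",
    "scipy": "scipy",
    "biopython": "Bio",
    "numpy": "numpy",
}


def find_missing(modules):
    """Return the pip names whose module cannot be located."""
    missing = []
    for pip_name, module_name in modules.items():
        try:
            spec = importlib.util.find_spec(module_name)
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is absent.
            spec = None
        if spec is None:
            missing.append(pip_name)
    return missing


print("Missing packages:", find_missing(REQUIRED_MODULES) or "none")
```

The check only locates each module and does not import it, so it runs quickly and does not touch the GPU. It does not verify versions or CUDA availability.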
For efficient downloading of large files, install Aria2 (the command below is for Debian/Ubuntu and may need to be run with sudo):
apt-get install aria2
Clone rMSA and set up the database:
git clone https://github.com/pylelab/rMSA
cd rMSA/database/
aria2c -q -R -x 16 -j 20 -s 65536 -c --optimize-concurrent-downloads https://kiharalab.org/nufold/database.zip
unzip database.zip && rm database.zip
cd ../..
IPknot is used for RNA secondary structure prediction. Download and set it up with the following commands:
wget https://github.com/satoken/ipknot/releases/download/v1.1.0/ipknot-1.1.0-x86_64-linux.zip
unzip ipknot-1.1.0-x86_64-linux.zip && rm ipknot-1.1.0-x86_64-linux.zip
chmod +x ipknot-1.1.0-x86_64-linux/ipknot
Download the NuFold model checkpoint to a designated directory:
mkdir -p checkpoints
wget -O checkpoints/global_step145245.pt http://kiharalab.org/nufold/global_step145245.pt
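Since the setup involves several large downloads, a quick check that everything landed where expected can save a failed run later. The sketch below is an illustration under stated assumptions: `missing_items` and `EXPECTED` are hypothetical names, the paths are those produced by the commands above when run from a single working directory, and the database is checked only as a directory because the archive's contents are not listed here.

```python
import os
from pathlib import Path

# Paths created by the setup commands above, relative to the directory
# in which they were run (an assumption about where you ran them).
EXPECTED = {
    "rMSA database directory": "rMSA/database",
    "IPknot binary": "ipknot-1.1.0-x86_64-linux/ipknot",
    "NuFold checkpoint": "checkpoints/global_step145245.pt",
}


def missing_items(root="."):
    """Return a list of human-readable problems found under root."""
    root = Path(root)
    problems = []
    for label, rel in EXPECTED.items():
        path = root / rel
        if not path.exists():
            problems.append(f"{label}: {rel} not found")
        elif path.is_file() and path.stat().st_size == 0:
            problems.append(f"{label}: {rel} is empty")
    # The IPknot binary must also carry the executable bit (chmod +x).
    ipknot = root / EXPECTED["IPknot binary"]
    if ipknot.is_file() and not os.access(ipknot, os.X_OK):
        problems.append("IPknot binary: not executable")
    return problems


for problem in missing_items():
    print(problem)
```

An empty result means the three items exist; it does not confirm that the downloads are complete or uncorrupted.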
We have created a new script that simplifies the process of running NuFold. To predict RNA structures with NuFold using the end-to-end script, follow these steps:
- Make the nufold.py script executable:
chmod +x nufold.py
- Run NuFold by providing the RNA sequence as a command-line argument:
./nufold.py your_sequence
The script will automatically generate a random job name, create the necessary directories, perform data preprocessing, run NuFold, and save the output in the nufold_output.zip file.
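For batch use, the same invocation can be driven from Python. This is a rough sketch rather than a NuFold API: `normalize_rna`, `run_nufold`, and `list_outputs` are hypothetical helpers, the restriction to A/C/G/U is an assumption about accepted input, and nufold_output.zip is assumed to be written to the current working directory.

```python
import subprocess
import zipfile

RNA_ALPHABET = set("ACGU")  # assumption: plain A/C/G/U input only


def normalize_rna(sequence):
    """Uppercase the sequence, drop whitespace, reject other symbols."""
    cleaned = "".join(sequence.split()).upper()
    if not cleaned:
        raise ValueError("empty sequence")
    bad = sorted(set(cleaned) - RNA_ALPHABET)
    if bad:
        raise ValueError(f"unexpected characters: {''.join(bad)}")
    return cleaned


def run_nufold(sequence, script="./nufold.py"):
    """Invoke the script with the sequence as its only argument."""
    subprocess.run([script, normalize_rna(sequence)], check=True)


def list_outputs(archive="nufold_output.zip"):
    """Return the file names stored in the output archive."""
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()
```

Calling `run_nufold("GGGAAACUUCGGUUUCCC")` followed by `list_outputs()` would run one prediction and show what the archive contains. Validating the sequence first avoids starting the preprocessing step on input with stray characters.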