ReAlign-N: an integrated realignment approach for multiple nucleic acid sequence alignment, combining global and local realignments
ReAlign-N is a tool written in C++11 for realigning multiple nucleic acid sequence alignments. It runs on Linux.
1. Install WSL for Windows. Instructional video 1 or 2 (Copyright belongs to the original work).
2. Download and install Anaconda. Download Anaconda for different systems here. Instructional video of Anaconda installation 1 or 2 (Copyright belongs to the original work).
3. Install ReAlign-N.
#1 Create and activate a conda environment for ReAlign-N
conda create -n realign_n_env
conda activate realign_n_env
#2 Add channels to conda
conda config --add channels malab
#3 Install ReAlign-N
conda install -c malab realign_n
#4 Test ReAlign-N
realign_n -h
- Download and compile the source code. (Make sure your GCC version is 9.4.0 or later.)
#1 Download
git clone https://github.com/malabz/ReAlign-N.git
#2 Open the folder
cd ReAlign-N
#3 Compile
make
#4 Test ReAlign-N
./realign_n -h
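The GCC requirement can be checked before running `make`. The following is only a sketch: the helper name `version_ge` is made up for illustration, and it assumes GNU `sort` (for the `-V` version-sort option) and a GCC that supports `-dumpfullversion` (GCC 7 or later).

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage (run manually; 9.4.0 is the minimum version stated in this README):
#   if version_ge "$(gcc -dumpfullversion)" 9.4.0; then
#       make
#   else
#       echo "GCC is older than 9.4.0" >&2
#   fi
```

A version sort is used instead of a plain string comparison so that, for example, 9.10.0 is treated as newer than 9.4.0.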
Usage: ./realign_n [-r] path [-a] path [-o] path [-m] mode
Necessary arguments:
-r Specify the path of raw data, a file in FASTA format.
-a Specify the path of initial alignment, a file in FASTA format.
Optional arguments:
-o Specify the output for ReAlign-N, a file in FASTA format.
-m Specify the mode of ReAlign-N (default mode: 1).
1 for local realignment followed by global realignment.
2 for global realignment followed by local realignment.
-h Print the help message.
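The options above can be wrapped in a small shell function that rejects undocumented modes and missing input files before ReAlign-N starts. This is a minimal sketch, not part of ReAlign-N: the function name `run_realign_n` and the `REALIGN_N_BIN` variable are assumptions introduced here, and the sketch always passes `-o` even though that option is optional.

```shell
# Hypothetical wrapper around the documented command line.
# REALIGN_N_BIN is an assumed variable name for the path to the binary;
# it defaults to "realign_n" on PATH (as after a conda install).
run_realign_n() {
    rn_raw=$1
    rn_aln=$2
    rn_out=$3
    rn_mode=${4:-1}   # this README gives 1 as the default mode

    # Only modes 1 and 2 are documented.
    case "$rn_mode" in
        1|2) ;;
        *) echo "mode must be 1 or 2, got: $rn_mode" >&2; return 2 ;;
    esac

    # -r and -a are necessary arguments, so fail early if a file is missing.
    for rn_f in "$rn_raw" "$rn_aln"; do
        [ -f "$rn_f" ] || { echo "input file not found: $rn_f" >&2; return 1; }
    done

    "${REALIGN_N_BIN:-realign_n}" -r "$rn_raw" -a "$rn_aln" -o "$rn_out" -m "$rn_mode"
}

# Usage:
#   REALIGN_N_BIN=./realign_n
#   run_realign_n raw.fas initial_alignment.fas realigned.fas 2
```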
Dataset | Number of sequences | Number of repeats | Average length | Similarity |
---|---|---|---|---|
16s simu | 100 | 9 | about 1550bp | 14 sets of data with different similarities (99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, 80%, 75%, 70%) |
mt simu | 100 | 9 | about 16000bp | 14 sets of data with different similarities (99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, 80%, 75%, 70%) |
sars2 simu | 100 | 9 | about 29000bp | 14 sets of data with different similarities (99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, 80%, 75%, 70%) |
CIPRES-128 | 255 | 9 | about 1550bp | The average similarity is about 80% |
CIPRES-256 | 511 | 9 | about 1550bp | The average similarity is about 80% |
CIPRES-512 | 1023 | 9 | about 1550bp | The average similarity is about 80% |
CIPRES-1024 | 2047 | 9 | about 1550bp | The average similarity is about 80% |
# Download data
wget http://lab.malab.cn/soft/ReAlign-N/data/16s_like.tar.gz
# Unzip data
tar -zxvf 16s_like.tar.gz
# Enter the data folder
cd 16s_like
# Run ReAlign-N (use `realign_n` instead of `./realign_n` if it was installed with conda)
./realign_n -r raw_data/16s_similarity_70_1.fas -a msa_results/16s_similarity_70_1_clustalo.fas -o 16s_similarity_70_1_clustalo_realign_n.fas -m 1
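To compare the two modes on the same input, the run can be repeated with `-m 2` and a different output file. The sketch below is one way to script that. The function names, the `_m1`/`_m2` output suffix, and the `REALIGN_N_BIN` variable are assumptions made for illustration, not ReAlign-N conventions.

```shell
# Hypothetical helper: derive an output name from an alignment path and a mode,
# e.g. msa_results/x_clustalo.fas with mode 1 gives x_clustalo_realign_n_m1.fas
realigned_name() {
    printf '%s_realign_n_m%s.fas\n' "$(basename "$1" .fas)" "$2"
}

# Run both documented modes on the same raw data ($1) and initial alignment ($2).
# REALIGN_N_BIN is an assumed variable; it defaults to "realign_n" on PATH.
realign_both_modes() {
    for rb_mode in 1 2; do
        "${REALIGN_N_BIN:-realign_n}" -r "$1" -a "$2" \
            -o "$(realigned_name "$2" "$rb_mode")" -m "$rb_mode" || return 1
    done
}

# Usage:
#   realign_both_modes raw_data/16s_similarity_70_1.fas \
#       msa_results/16s_similarity_70_1_clustalo.fas
```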
- Currently, ReAlign-N is available ONLY for DNA/RNA sequences.
- Ensure that every sequence ID in the input to ReAlign-N is unique.
- ReAlign-N requires MAFFT to be installed.
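Two of these notes can be checked from the shell before a run. This is a rough sketch under stated assumptions: the function names are hypothetical, the sequence ID is taken to be the text after `>` up to the first whitespace (ReAlign-N may treat the whole header line as the ID), and MAFFT is assumed to be installed under the executable name `mafft`.

```shell
# Print every sequence ID that appears more than once in a FASTA file.
# Assumption: the ID is the text after '>' up to the first whitespace.
duplicate_ids() {
    grep '^>' "$1" | sed -e 's/^>//' -e 's/[[:space:]].*$//' | sort | uniq -d
}

# Succeed only if an executable named mafft is found on PATH.
mafft_available() {
    command -v mafft >/dev/null 2>&1
}

# Usage:
#   dups=$(duplicate_ids raw_data/16s_similarity_70_1.fas)
#   [ -z "$dups" ] || echo "duplicate sequence IDs: $dups" >&2
#   mafft_available || echo "mafft not found on PATH" >&2
```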
System | GCC version |
---|---|
Linux | GCC 9.4.0 |
WSL | GCC 9.4.0 |
The software tools are developed and maintained by 🧑🏫ZOU's lab.
If you find a bug, please contact us on the issues page or email us at 👉📩.
For more tools and information, visit our GitHub.