ItLnc-BXE ('Identification of plant lncRNAs using a Bagging-XGBoost-ensemble method with multiple features') is a computational tool for identifying plant long non-coding RNAs (lncRNAs).
The datasets and tools we used are available at: https://pan.baidu.com/s/1M3zAve936BBbReoaFL8ZEQ
1. To use this tool, clone this repository to your machine:
git clone https://github.com/BioMedicalBigDataMiningLab/ItLnc-BXE.git
2. Download the datasets and tools
2.1 Download '/data' and '/feamodule.zip' from the URL given above
2.2 Unzip feamodule.zip to /ItLnc-BXE/src/feamodule/
2.3 Unzip /ItLnc-BXE/src/feamodule/blast.zip to /ItLnc-BXE/src/feamodule/blast
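Steps 2.2 and 2.3 can also be scripted. The following is a minimal sketch using only the Python standard library; the function names `extract_archive` and `install_feature_modules` and the `download_dir` argument are hypothetical and not part of ItLnc-BXE. It assumes feamodule.zip has already been downloaded and that the archives have no extra top-level folder; if they do, the extracted files will sit one directory deeper and may need to be moved.

```python
import zipfile
from pathlib import Path


def extract_archive(archive, destination):
    """Extract a zip archive into destination, creating the directory if needed."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(destination)
    return destination


def install_feature_modules(download_dir, repo_dir):
    """Sketch of steps 2.2 and 2.3 (hypothetical helper, not part of the tool).

    download_dir: folder that holds the downloaded feamodule.zip (assumption).
    repo_dir:     path of the cloned ItLnc-BXE repository.
    """
    feamodule = Path(repo_dir) / "src" / "feamodule"
    # Step 2.2: feamodule.zip -> <repo>/src/feamodule/
    extract_archive(Path(download_dir) / "feamodule.zip", feamodule)
    # Step 2.3: <repo>/src/feamodule/blast.zip -> <repo>/src/feamodule/blast
    extract_archive(feamodule / "blast.zip", feamodule / "blast")
    return feamodule
```

For example, `install_feature_modules("downloads", "ItLnc-BXE")` would unpack both archives, provided `downloads/feamodule.zip` exists.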
To use this tool you will need:
- linux
- python3
  - numpy
  - pandas
  - biopython
  - scikit-learn
  - xgboost
  - deap
  - CPAT
- python2
  - numpy
  - regex
  - biopython
  - scikit-learn
  - scipy
- nodejs
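Before running the tool, it may help to check that the requirements are visible from your environment. The sketch below is illustrative only: the helper names are hypothetical, the import names `Bio` (biopython) and `sklearn` (scikit-learn) are the usual ones for those packages, and the executable names `python3`, `python2` and `node` are assumptions about how the interpreters are installed on your system. CPAT and the python2 packages are not checked here, since they cannot be probed reliably from a python3 interpreter.

```python
import importlib.util
import shutil

# Import names of the python3 packages listed above.
# biopython is imported as "Bio" and scikit-learn as "sklearn".
PY3_MODULES = ["numpy", "pandas", "Bio", "sklearn", "xgboost", "deap"]

# Executable names are assumptions; adjust them to your installation.
EXECUTABLES = ["python3", "python2", "node"]


def missing_modules(modules):
    """Return the module names that cannot be found by the current interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def missing_executables(names):
    """Return the executable names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]
```

Calling `missing_modules(PY3_MODULES)` and `missing_executables(EXECUTABLES)` returns empty lists when everything checked is available.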
Run:
python3 ItLnc-BXE.py -i input.fa -m a -o result
Model selection (-m):
a : Arabidopsis_thaliana
c : Chlamydomonas_reinhardtii
h : Hordeum_vulgare
o : Oryza_sativa_Japonica_Group
p : Physcomitrella_patens
s : Solanum_tuberosum
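When running the tool on several inputs from a script, the model codes above can be validated before the command is launched. This is a minimal sketch, not part of ItLnc-BXE: `MODELS` and `build_command` are hypothetical names, and the command is assumed to be run from the directory that contains ItLnc-BXE.py.

```python
# Model codes accepted by -m, as listed above.
MODELS = {
    "a": "Arabidopsis_thaliana",
    "c": "Chlamydomonas_reinhardtii",
    "h": "Hordeum_vulgare",
    "o": "Oryza_sativa_Japonica_Group",
    "p": "Physcomitrella_patens",
    "s": "Solanum_tuberosum",
}


def build_command(input_fasta, model, output):
    """Build the argument list for the Run command shown above.

    Raises ValueError if the model code is not one of the listed options.
    """
    if model not in MODELS:
        raise ValueError(
            "unknown model %r; choose one of %s" % (model, ", ".join(sorted(MODELS)))
        )
    return ["python3", "ItLnc-BXE.py", "-i", str(input_fasta), "-m", model, "-o", str(output)]


# Example usage (not executed here):
#   import subprocess
#   subprocess.run(build_command("input.fa", "a", "result"), check=True)
```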
all_features.csv: the 175 features extracted from the input transcripts.
result: the predicted classes of the input transcripts.
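A quick sanity check on all_features.csv is to look at its shape. The sketch below assumes only that the file is comma-separated; whether it has a header row or an identifier column is not stated here, so the number of fields may differ from 175. `csv_shape` is a hypothetical helper name.

```python
import csv


def csv_shape(path):
    """Return (number of rows, number of fields in the first row) of a CSV file."""
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle))
    n_fields = len(rows[0]) if rows else 0
    return len(rows), n_fields
```

For example, `csv_shape("all_features.csv")` reports how many rows were written and how many fields the first row holds.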