A collection of code snippets and scripts related to my bioinformatics work.
- fasta_split_size.py
Python script that splits a multi-sequence FASTA file into smaller files with similar total sizes. For example, it can split a 1 Mb FASTA file containing 100 sequences into smaller files of around 1 Kb each. The resulting files have similar total sizes but may contain different numbers of sequences. Modified from this script
- splitFasta.pl
Perl script that splits a FASTA file into chunks of a given size or into a given number of chunks. Taken from DeconSeq.
Usage:
perl splitFasta.pl [-h] [-help] [-version] [-man] [-verbose] [options]
Examples:
perl splitFasta.pl -verbose -i file.fasta -s 2   # chunks of 2 MB
perl splitFasta.pl -verbose -i file.fasta -n 10  # 10 chunks
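
To make the size-based splitting described for fasta_split_size.py more concrete, here is a minimal Python sketch of one way to do it. It is not the code of that script: the function names `parse_fasta` and `group_by_size` are hypothetical, and the greedy strategy (close a group as soon as its total sequence length reaches the target, never cutting a sequence) is an assumption made for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without the leading '>', sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    parts: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header = line[1:]
            parts = []
        elif header is not None:
            parts.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(parts)


def group_by_size(records: Iterable[Record], target_size: int) -> Iterator[List[Record]]:
    """Greedily pack whole records into groups of roughly target_size bases.

    A group is closed once its total sequence length reaches target_size,
    so groups have similar totals but may hold different numbers of records.
    """
    group: List[Record] = []
    total = 0
    for header, seq in records:
        group.append((header, seq))
        total += len(seq)
        if total >= target_size:
            yield group
            group, total = [], 0
    if group:  # leftover records that did not reach the target
        yield group


if __name__ == "__main__":
    demo = [">a", "ACGT", ">b", "ACG", ">c", "ACGTAC", ">d", "AC"]
    for i, grp in enumerate(group_by_size(parse_fasta(demo), target_size=6), start=1):
        names = [h for h, _ in grp]
        size = sum(len(s) for _, s in grp)
        print(f"chunk {i}: {names} ({size} bases)")
```

Because records are kept whole in this sketch, a chunk can overshoot the target by up to the length of its last sequence, and the final chunk may be smaller than the others. Writing each group to its own output file is left out to keep the example short.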