- Bash scripts for basic DNA analysis.
- Most scripts accept data in GenBank (gbk) format.
- The genomes directory contains a few samples of mitochondrial DNA records without any metadata.
- The weka directory contains scripts that can come in handy when classifying or clustering the data with the Weka library (a Java machine learning library).
The picture shows a clustering of species based on their mitochondrial DNA.
Counts how many times the same bit occurs in a row, for every bit (A, C, G, T).
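As an illustration of this kind of run counting, here is a minimal sketch. It is not the script from this repository: the function name `run_lengths` and the output layout (bit, run length, number of runs of that length) are assumptions made for the example.

```shell
# run_lengths (hypothetical helper): read a sequence on standard input and
# print one line per (bit, run length) pair: "BIT LENGTH COUNT".
run_lengths() {
  tr -d '\n' | tr 'acgt' 'ACGT' | awk '
    {
      n = length($0)
      # Walk the sequence one maximal run at a time.
      for (i = 1; i <= n; i += len) {
        c = substr($0, i, 1)
        len = 1
        while (i + len <= n && substr($0, i + len, 1) == c) len++
        count[c " " len]++
      }
    }
    END { for (key in count) print key, count[key] }
  ' | sort -k1,1 -k2,2n
}

printf 'AACCCAAG\n' | run_lengths
# A 2 2
# C 3 1
# G 1 1
```

In the sample, `A 2 2` means that a run of two consecutive A's occurs twice.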
Plots tmp[number] data from the tmp folder. Looks for animal names in tmp/names. USAGE: ./plotscript.sh "filenames" -> plots tmp1, tmp2 and tmp3. WARNING: Always use double quotes around filenames, even if you use a regex.
Plots tmp[number] data from the tmp folder. Looks for animal names in tmp/names. USAGE: ./plotscript.sh "filenames" -> plots tmp1, tmp2 and tmp3.
Plots tmp[number] data from the tmp folder. Looks for animal names in tmp/names. USAGE: ./plotscript.sh "1 2 3" -> plots tmp1, tmp2 and tmp3.
Outputs all combinations of adjacent characters of length $1. Reads from standard input.
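A minimal sketch of this sliding-window idea, assuming the input is a plain sequence with optional line breaks. The function name `kmers` is hypothetical and the repository script may be implemented differently.

```shell
# kmers K (hypothetical helper): print every window of K adjacent characters
# read from standard input, one window per line. Line breaks are ignored.
kmers() {
  tr -d '\n' | awk -v k="$1" '
    { for (i = 1; i + k - 1 <= length($0); i++) print substr($0, i, k) }
  '
}

printf 'ACGTA\n' | kmers 2
# AC
# CG
# GT
# TA
```

Piping the result through `sort | uniq -c` gives the frequency of each combination.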
Prints all the species names from the gbk files in the directory passed as an argument.
Prints the species name from a gbk file. The gbk file needs to be piped in.
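One possible way to pull the species name out of a piped gbk record is sketched below. It assumes the name is taken from the `ORGANISM` line of the GenBank flat file; whether the repository script reads that line or another field (for example `SOURCE`) is not stated here, and the function name `gbk_species` is hypothetical.

```shell
# gbk_species (hypothetical helper): print the text that follows the first
# ORGANISM keyword of a GenBank record read from standard input.
gbk_species() {
  awk '
    !found && $1 == "ORGANISM" {
      sub(/^[ \t]*ORGANISM[ \t]+/, "")   # drop the keyword and its padding
      print
      found = 1                           # ignore any later ORGANISM lines
    }
  '
}
```

It would be used as `gbk_species < record.gbk`, where `record.gbk` stands for any gbk file.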
Prints only the gene sequence from a gbk file. The file needs to be piped in.
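In a GenBank flat file the sequence sits between the `ORIGIN` line and the terminating `//` line, with position numbers and spaces mixed in. The sketch below shows one way to strip those out; the function name `gbk_sequence` is hypothetical and the output format (a single lowercase line) is an assumption, not necessarily what the repository script prints.

```shell
# gbk_sequence (hypothetical helper): print the bases found between the
# ORIGIN line and the closing "//" of a GenBank record read from stdin.
gbk_sequence() {
  awk '
    /^\/\//   { in_origin = 0 }                          # end of record
    in_origin { gsub(/[^A-Za-z]/, ""); printf "%s", $0 } # keep letters only
    /^ORIGIN/ { in_origin = 1 }                          # sequence starts on the next line
    END       { print "" }                               # finish with a newline
  '
}
```

It would be used as `gbk_sequence < record.gbk`, where `record.gbk` stands for any gbk file.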
Deprecated. All combinations of four bits.
Deprecated. Four different bits.
Deprecated. Four bits without a repetition.
deprecated
deprecated
deprecated
deprecated
USAGE: ./sequencer5.sh "1 2 3"
USAGE: ./sequencer5.sh "1 2 3". Runs the query on the files numbered in the first argument. If a combination doesn't exist in a file, it is marked 0. Results are ordered by the global frequency of the combination.
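The steps above (count combinations per file, fill in 0 where a combination is missing, order by global frequency) can be sketched as follows. This is an illustration under simplifying assumptions, not the code of sequencer5.sh: the function name `kmer_table` is hypothetical, it takes a combination length and file paths rather than file numbers, and it expects each file to hold a plain sequence on a single line.

```shell
# kmer_table K FILE... (hypothetical helper): print one row per combination of
# length K with its count in each file, most frequent combination first.
kmer_table() {
  k=$1
  shift
  awk -v k="$k" '
    FNR == 1 { nfiles++ }                     # a new input file starts
    {
      for (i = 1; i + k - 1 <= length($0); i++) {
        kmer = substr($0, i, k)
        count[kmer, nfiles]++                 # per-file count
        total[kmer]++                         # global count, used for ordering
      }
    }
    END {
      for (kmer in total) {
        line = total[kmer] " " kmer
        # "+ 0" turns a missing per-file count into 0
        for (f = 1; f <= nfiles; f++) line = line " " (count[kmer, f] + 0)
        print line
      }
    }
  ' "$@" | sort -k1,1nr -k2,2 | cut -d' ' -f2-
}
```

For two files containing `ACAC` and `ACGT`, `kmer_table 2 file1 file2` puts `AC 2 1` first, followed by the combinations that occur only once, each with a 0 in the column of the file where it is absent.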
USAGE: ./.sh "filenames". WARNING: Always use double quotes around filenames, so they get treated as a single argument. Only performs the first part of the operation and saves intermediate results in the ./comb subfolder. Runs the query on the files numbered in the first argument. If a combination doesn't exist, it is marked 0. Creates directories if they don't exist, removes the path from fName, shows a progress bar and removes duplicates from the names file.
USAGE: ./.sh "filenames" <> ./sequencerGbk.sh "filenames". Runs the query on the files numbered in the first argument. If a combination doesn't exist, it is marked 0. Orders by global frequency of the combination. Shows a progress bar. Prints all results side by side.
USAGE: ./sequencerGbk.sh "filenames". Runs the query on the files numbered in the first argument. If a combination doesn't exist, it is marked 0. Orders by global frequency of the combination. Shows a progress bar.
USAGE: ./sequencerGbk.sh "filenames". Runs the query on the files numbered in the first argument. If a combination doesn't exist, it is marked 0. Orders by global frequency of the combination. Prints all results side by side.
Makes the results the same length.