sequencing

A repository of algorithms for genomic sequencing.

Algorithms

Boyer-Moore

The repository currently contains an implementation of the Boyer-Moore algorithm for exact matching of a sequencing read against a reference genome. The implementation uses a linear-time algorithm (the "Z algorithm"), as described by Gusfield (lecture notes 4 & 5), to preprocess the query string into the tables used by the good suffix and bad character rules.
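
As an illustration of the preprocessing step, the following is a minimal sketch of the Z algorithm and of one table commonly derived from it for the good suffix rule. The function names z_array and suffix_match_lengths are hypothetical and are not taken from align.cpp; the actual implementation may be organized differently.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// z[i] = length of the longest substring of s starting at i that is also a
// prefix of s. By convention here, z[0] is the length of s.
std::vector<int> z_array(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> z(n, 0);
    if (n == 0) return z;
    z[0] = n;
    int l = 0, r = 0;  // [l, r) is the rightmost prefix match found so far
    for (int i = 1; i < n; ++i) {
        // Reuse a previously computed value when i lies inside [l, r).
        if (i < r) z[i] = std::min(r - i, z[i - l]);
        // Extend the match by explicit character comparisons.
        while (i + z[i] < n && s[z[i]] == s[i + z[i]]) ++z[i];
        assert(i + z[i] <= n);  // a match never runs past the end of s
        if (i + z[i] > r) {
            l = i;
            r = i + z[i];
        }
    }
    return z;
}

// n[j] = length of the longest suffix of p[0..j] that is also a suffix of p,
// obtained by running the Z algorithm on the reversed pattern.
std::vector<int> suffix_match_lengths(const std::string& p) {
    const std::string reversed(p.rbegin(), p.rend());
    std::vector<int> n = z_array(reversed);
    std::reverse(n.begin(), n.end());
    return n;
}
```

Each character comparison in the inner loop either ends the current extension or moves r to the right, which is why the total work stays linear in the length of the string.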

Tests

The full test suite checks both the correctness and the performance of the Boyer-Moore implementation, using the naive string matching algorithm as a baseline for comparison.
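
For reference, a naive matcher of the kind used as a baseline can be sketched as follows. This is an illustrative version only: the name naive_match and its signature are assumptions and do not necessarily match the code in align.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns every offset in text at which pattern occurs, by trying each
// alignment in turn and comparing characters left to right.
std::vector<std::size_t> naive_match(const std::string& pattern,
                                     const std::string& text) {
    assert(!pattern.empty());  // precondition assumed by this sketch
    std::vector<std::size_t> hits;
    if (pattern.size() > text.size()) return hits;
    for (std::size_t i = 0; i + pattern.size() <= text.size(); ++i) {
        std::size_t j = 0;
        while (j < pattern.size() && text[i + j] == pattern[j]) ++j;
        if (j == pattern.size()) hits.push_back(i);
    }
    return hits;
}
```

Because this version tries every alignment, its list of offsets is a convenient ground truth against which the output of a faster matcher can be compared.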

Test cases for correctness are available in the tests directory. The following command converts these test cases into the FASTA format required to run the full test suite:

./format_test.sh

Test cases for performance must be downloaded separately. The reference genomes are from the Dec. 2013 (GRCh38/hg38) assembly of the human genome. Chromosomes 1 and 20 were used and can be downloaded here and here respectively. Place the decompressed files in the root directory of the repository. The query string is included in FASTA format in the file p.fa and is taken from Ben Langmead's slides.
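
To make the expected input concrete, here is a minimal sketch of reading a sequence from a FASTA file. It assumes the simple layout of header lines beginning with '>' followed by sequence lines, and it concatenates every sequence line it sees, which suits a file holding a single record. The function names are hypothetical and are not taken from align.cpp.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Concatenates all sequence lines from a FASTA stream, skipping header lines
// (those starting with '>') and blank lines.
std::string read_fasta_sequence(std::istream& in) {
    std::string sequence;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty() || line[0] == '>') continue;
        sequence += line;
    }
    return sequence;
}

// Convenience wrapper for FASTA content that is already in memory.
std::string read_fasta_text(const std::string& text) {
    std::istringstream in(text);
    return read_fasta_sequence(in);
}
```

A file such as p.fa could be read by passing a std::ifstream opened on it to read_fasta_sequence.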

The full test suite can then be run with the following command:

./run_test.sh

Acknowledgements

Ben Langmead's course was immensely helpful and served as a guide throughout the development of this repository and while learning about genomic sequencing.
