Genscale Team edited this page Jun 9, 2017 · 7 revisions

Welcome to the GATB-Core wiki

You can use the GATB-Core library to develop new NGS data analysis software.

GATB-Core natively provides the following high-performance, memory-efficient operations:

Reads handling:

  • FASTA/FASTQ parsing and writing (plain text and gzipped files are supported)
  • Parallel iteration of sequences
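To make the read-handling step concrete, here is a minimal sketch of plain-text FASTA parsing in standard C++. It does not use the GATB-Core API: the `FastaRecord` type and the `parseFasta` function are hypothetical names introduced for illustration, and gzipped input, FASTQ, and parallel iteration are all left out.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for this sketch; not a GATB-Core class.
struct FastaRecord {
    std::string header;    // text after '>' on the header line
    std::string sequence;  // concatenation of all sequence lines
};

// Parse plain-text FASTA from a stream (no gzip support in this sketch).
std::vector<FastaRecord> parseFasta(std::istream& in) {
    std::vector<FastaRecord> records;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;
        if (line[0] == '>') {
            // A header line starts a new record.
            records.push_back({line.substr(1), ""});
        } else if (!records.empty()) {
            // Sequence lines are appended to the current record.
            records.back().sequence += line;
        }
    }
    return records;
}
```

The function accepts any `std::istream`, so it can be fed from a `std::ifstream` opened on a file or from a `std::istringstream` wrapping an in-memory string.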

K-mer:

  • K-mer counting
  • Minimizer computation of k-mers, partitioning of datasets by minimizers
  • Bloom filter of k-mers
  • Hash table of k-mers
  • Minimal perfect hash function of k-mers
  • Representation of arbitrarily large k-mers
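The first two k-mer operations listed above can be illustrated with a small standalone sketch. It is not the GATB-Core implementation: it stores k-mers as `std::string` in a hash map rather than in a compact, disk-backed form, and it assumes a plain lexicographic ordering for minimizers, which may differ from the ordering the library uses. All function names (`reverseComplement`, `canonical`, `countKmers`, `minimizer`) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Reverse complement over {A,C,G,T}; any other character maps to 'N'.
std::string reverseComplement(const std::string& s) {
    std::string rc(s.rbegin(), s.rend());
    for (char& c : rc) {
        switch (c) {
            case 'A': c = 'T'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            case 'T': c = 'A'; break;
            default:  c = 'N'; break;
        }
    }
    return rc;
}

// Canonical form: the lexicographically smaller of a k-mer and its
// reverse complement, so both strands are counted together.
std::string canonical(const std::string& kmer) {
    std::string rc = reverseComplement(kmer);
    return std::min(kmer, rc);
}

// Count canonical k-mers of one sequence with a sliding window.
std::unordered_map<std::string, std::size_t>
countKmers(const std::string& seq, std::size_t k) {
    std::unordered_map<std::string, std::size_t> counts;
    if (k == 0 || seq.size() < k) return counts;
    for (std::size_t i = 0; i + k <= seq.size(); ++i) {
        ++counts[canonical(seq.substr(i, k))];
    }
    return counts;
}

// Minimizer: the smallest m-mer inside a k-mer, here under plain
// lexicographic order (an assumption made for this sketch).
std::string minimizer(const std::string& kmer, std::size_t m) {
    if (m == 0 || kmer.size() < m) return "";
    std::string best = kmer.substr(0, m);
    for (std::size_t i = 1; i + m <= kmer.size(); ++i) {
        std::string candidate = kmer.substr(i, m);
        if (candidate < best) best = candidate;
    }
    return best;
}
```

K-mers that share a minimizer can be routed to the same partition, which is the idea behind partitioning a dataset by minimizers.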

de Bruijn graph:

  • graph construction
  • graph traversal operations (contigs, unitigs)
  • graph simplifications for assembly (tip removal, bulge removal)
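As an illustration of graph construction and traversal, the sketch below builds a de Bruijn graph implicitly from a set of k-mers and walks a non-branching path to the right, which is the basic step of unitig extraction. It is a simplified model, not the GATB-Core graph API: nodes are stored as strings, only the forward strand is considered (no reverse complements), no abundance filtering is applied, and the names `buildNodes`, `successors`, `predecessors`, and `extendRight` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

using KmerSet = std::unordered_set<std::string>;

// Nodes of the graph: every distinct k-mer found in the reads.
KmerSet buildNodes(const std::vector<std::string>& reads, std::size_t k) {
    KmerSet nodes;
    if (k == 0) return nodes;
    for (const std::string& read : reads) {
        for (std::size_t i = 0; i + k <= read.size(); ++i) {
            nodes.insert(read.substr(i, k));
        }
    }
    return nodes;
}

// Edges are implicit: a successor overlaps the node by k-1 characters.
std::vector<std::string> successors(const KmerSet& nodes, const std::string& kmer) {
    std::vector<std::string> out;
    if (kmer.empty()) return out;
    const std::string alphabet = "ACGT";
    for (char c : alphabet) {
        std::string next = kmer.substr(1) + c;
        if (nodes.count(next)) out.push_back(next);
    }
    return out;
}

std::vector<std::string> predecessors(const KmerSet& nodes, const std::string& kmer) {
    std::vector<std::string> out;
    if (kmer.empty()) return out;
    const std::string alphabet = "ACGT";
    for (char c : alphabet) {
        std::string prev = c + kmer.substr(0, kmer.size() - 1);
        if (nodes.count(prev)) out.push_back(prev);
    }
    return out;
}

// Extend a path to the right while it stays non-branching: the current
// node has exactly one successor, and that successor has exactly one
// predecessor. The visited set stops the walk on cycles.
std::string extendRight(const KmerSet& nodes, const std::string& start) {
    std::string path = start;
    std::string current = start;
    KmerSet visited{start};
    while (true) {
        std::vector<std::string> succ = successors(nodes, current);
        if (succ.size() != 1) break;
        const std::string next = succ[0];
        if (predecessors(nodes, next).size() != 1) break;
        if (!visited.insert(next).second) break;
        path += next.back();
        current = next;
    }
    return path;
}
```

Because edges are derived from the node set on demand, the only structure that has to be stored is the set of k-mers itself.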

Other optimized data structures

In addition to the de Bruijn graph data structure, GATB-Core provides several others that can be of interest for general-purpose development. These are:

Audience

The GATB-Core library is intended for developers with C++ programming skills.

We also provide pyGATB, a Python 3 wrapper around the GATB-Core C++ APIs.

Documentation

Start your discovery of the library with:

Contact the GATB-Core Development Team

Feel free to contact the GATB-Core development team if you have any questions about using the library.