seq-collection

A collection of useful sequencing utilities.

Documentation

Development

As I have done in the past, I intend to develop seq-collection over a period of months or years. In general, this means I'll add a new tool or utility when it becomes apparent that I need it.

However, I am open to PRs, requests, and feedback. Please let me know what you think.

I intend to port some commands over from VCF-kit, but will broaden their applicability to FASTQs, BAMs, and perhaps other formats.

Commands

FASTA

  • fa-gc - Calculates GC content surrounding a given genomic position at specified window sizes.
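
To make the idea behind fa-gc concrete, here is a minimal Python sketch of computing GC content in a window centered on a position. It is not the tool's implementation: the function name `gc_content`, the 0-based position, and the choice to clip the window at sequence ends are all assumptions made for illustration.

```python
def gc_content(sequence: str, position: int, window: int) -> float:
    """Fraction of G/C bases in a window centered on a 0-based position.

    Assumption: the window extends window // 2 bases to each side of the
    position and is clipped at the ends of the sequence.
    """
    half = window // 2
    start = max(0, position - half)
    end = min(len(sequence), position + half + 1)
    region = sequence[start:end].upper()
    if not region:
        return 0.0
    gc = sum(base in "GC" for base in region)
    return gc / len(region)


# Report GC content around one position at several window sizes.
seq = "AAGGCCAATTGGCCAA"
for size in (2, 4, 8):
    print(size, round(gc_content(seq, 4, size), 3))
```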

FASTQ

  • fq-dedup - De-duplicates a FASTQ by read ID.
  • fq-count - Counts the number of reads in a FASTQ and reports other metrics.
  • fq-meta - Provides basic information about FASTQs and a best guess as to the sequencer that produced them.
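
De-duplicating by read ID, as fq-dedup is described above, can be sketched in a few lines of Python. This is an illustration only and says nothing about how the tool itself works. It assumes well-formed four-line FASTQ records, takes the read ID to be the header text up to the first whitespace, and keeps the first record seen for each ID; the helper names `read_fastq` and `dedup_by_id` are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str, str]  # header, sequence, separator, quality


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Group lines into 4-line FASTQ records (assumes well-formed input)."""
    stripped = (line.rstrip("\n") for line in lines)
    # zip over the same iterator four times consumes four lines per record.
    yield from zip(stripped, stripped, stripped, stripped)


def dedup_by_id(records: Iterable[Record]) -> Iterator[Record]:
    """Yield only the first record seen for each read ID."""
    seen = set()
    for record in records:
        # Assumption: the ID is the header without '@', up to the first space.
        read_id = record[0][1:].split()[0]
        if read_id in seen:
            continue
        seen.add(read_id)
        yield record


example = [
    "@r1 first\n", "ACGT\n", "+\n", "IIII\n",
    "@r2\n", "GGGG\n", "+\n", "IIII\n",
    "@r1 second\n", "ACGT\n", "+\n", "IIII\n",
]
for header, seq, _, qual in dedup_by_id(read_fastq(example)):
    print(header, seq, qual)
```

Keeping every ID in a set costs memory proportional to the number of unique reads, which is the simplest trade-off for a sketch like this.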

BAM

  • insert-size - Calculates insert-size quickly.

VCF

  • json - Convert VCF output to JSON.
  • sample - Samples variants from a VCF.

Multi

  • iter - Generates genomic ranges for parallelizing commands.
  • rand - Generates random sites and regions from FASTA, BAM, and VCF.
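
The purpose of iter, splitting a genome into ranges so that each range can be handed to a separate job, can be sketched as follows. This is a hedged illustration rather than the command's actual behavior: the function name `genomic_ranges`, the chromosome names and lengths, and the 1-based inclusive `chrom:start-end` output format are assumptions.

```python
from typing import Dict, Iterator


def genomic_ranges(chrom_lengths: Dict[str, int], width: int) -> Iterator[str]:
    """Yield fixed-width, 1-based inclusive ranges covering each chromosome."""
    for chrom, length in chrom_lengths.items():
        for start in range(1, length + 1, width):
            # The last range on a chromosome is clipped to its length.
            end = min(start + width - 1, length)
            yield f"{chrom}:{start}-{end}"


# Hypothetical chromosome lengths, for illustration only.
for region in genomic_ranges({"I": 250, "II": 100}, 100):
    print(region)
```

Each emitted range could then be passed as a region argument to a separate invocation of a command, which is what makes the work parallelizable.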