tomarovsky/Biocrutch

Biocrutch

License: MIT

Concept scripts for bioinformatics.

Content:

  • SRAtoolkit. The program parses a Sequence Read Archive (SRA) link and downloads the reads in SRA format. It then checks the integrity of the downloaded reads by parsing the required metrics of the source files.
  • Pseudoautosomal region. A script that determines the coordinates of the pseudoautosomal region on a sex chromosome. The output is a BED file with these coordinates.
  • RepeatMasking scripts. Scripts for converting TRF, RepeatMasker and WindowMasker output to GFF format.
  • EMA to FASTQ. Combines EMA output bin files into forward FASTQ, reverse FASTQ, and barcode-only files.
  • QuastCore. An alternative to the publicly available QUAST program. Its main differences, among others, are:
    1. it adds only the necessary cutoffs.
    2. it counts missing (N) values.
    3. its output is a pandas DataFrame that can be used for further analysis.
    4. it can write the results to a convenient CSV file.
  • Coverage statistics. A script that calculates median, mean, maximum, and minimum coverage from the output of bedtools genomecov and mosdepth. It can:
    1. calculate statistics for the whole genome.
    2. calculate statistics for each scaffold.
    3. calculate statistics in stacking windows.
  • PSMC data combine. The script combines data from several PSMC outputs to draw multiple demographic population histories on a single graph.
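
The coverage statistics listed above can be illustrated with a minimal sketch. This is not the Biocrutch implementation: the function names are hypothetical, the input is assumed to be per-base depth in three tab-separated columns (scaffold, position, depth), as produced by `bedtools genomecov -d`, and "stacking windows" is interpreted here as consecutive non-overlapping windows.

```python
import io
import statistics
from collections import defaultdict


def coverage_stats(depths):
    """Return median, mean, max and min for a list of per-base depths."""
    return {
        "median": statistics.median(depths),
        "mean": statistics.mean(depths),
        "max": max(depths),
        "min": min(depths),
    }


def read_per_base_depth(handle):
    """Group depths by scaffold from lines of: scaffold<TAB>position<TAB>depth."""
    per_scaffold = defaultdict(list)
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        scaffold, _position, depth = line.rstrip("\n").split("\t")
        per_scaffold[scaffold].append(int(depth))
    return per_scaffold


def window_stats(depths, window_size):
    """Stats for consecutive non-overlapping windows (assumed meaning of
    'stacking windows'); an incomplete trailing window is dropped."""
    return [
        coverage_stats(depths[start:start + window_size])
        for start in range(0, len(depths) - window_size + 1, window_size)
    ]


# Made-up example data standing in for a real per-base depth file.
example = io.StringIO(
    "scaffold_1\t1\t10\n"
    "scaffold_1\t2\t12\n"
    "scaffold_1\t3\t8\n"
    "scaffold_1\t4\t14\n"
    "scaffold_2\t1\t30\n"
    "scaffold_2\t2\t34\n"
)

per_scaffold = read_per_base_depth(example)

# 1. whole genome, 2. each scaffold, 3. windows along one scaffold
whole_genome = [d for depths in per_scaffold.values() for d in depths]
genome_stats = coverage_stats(whole_genome)
scaffold_stats = {name: coverage_stats(d) for name, d in per_scaffold.items()}
windows = window_stats(per_scaffold["scaffold_1"], 2)
```

Holding every depth in memory keeps the sketch short; for large genomes a streaming or histogram-based approach would likely be needed.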

To use the Biocrutch package, you need to add the package path to PYTHONPATH.
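
Python builds `sys.path` from PYTHONPATH at startup, so the same effect can be achieved from inside a script. The sketch below is one possible way to do this; the helper name and the `/path/to/Biocrutch` location are placeholders, not part of the package.

```python
import sys
from pathlib import Path


def add_to_module_search_path(package_path):
    """Prepend a directory to sys.path unless it is already present.

    Returns the resolved path as a string.
    """
    resolved = str(Path(package_path).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Placeholder location; replace with the path to your own copy of Biocrutch.
biocrutch_path = add_to_module_search_path("/path/to/Biocrutch")
```

After this call, imports are resolved against the added directory for the rest of the Python session only. Setting PYTHONPATH in the shell makes the change apply to every script started from that shell.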

Copyright (c) 2020 Andrey Tomarovsky
