bacpage

This repository contains an easy-to-use pipeline for the assembly and analysis of bacterial genomes from ONT long-read or Illumina short-read sequencing data. Complete documentation and instructions for bacpage and each of its functions are available in the project's documentation.

Introduction

Advances in sequencing technology during the COVID-19 pandemic have led to massive increases in the generation of sequencing data. Many bioinformatics tools have been developed to analyze these data, but very few can be used by individuals without prior bioinformatics training.

This pipeline was designed to encapsulate pre-existing tools to automate the analysis of whole-genome sequencing of bacteria. Installation is fast and straightforward. The pipeline is easy to set up and ships with rational defaults, but is highly modular and configurable by more advanced users. Bacpage has individual commands to generate consensus sequences, perform de novo assembly, construct phylogenetic trees, and generate quality control reports.

Features

We anticipate the pipeline will be able to perform the following functions:

Reference-based assembly of Illumina paired-end reads
De novo assembly of Illumina paired-end reads
De novo assembly of ONT long reads
Run quality control checks
Variant calling using bcftools
Maximum-likelihood phylogenetic inference of processed samples and background dataset using iqtree
MLST profiling and virulence factor detection
Antimicrobial resistance gene detection
Plasmid detection

Installation

Install mamba by running the following two commands:

curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
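The installer file name in the commands above is assembled from the output of uname. As a small sketch (the variable names are only illustrative), you can preview what those substitutions resolve to on your machine before downloading anything:

```shell
# Preview the values that $(uname) and $(uname -m) expand to.
os_name="$(uname)"     # kernel name, e.g. Linux or Darwin
arch="$(uname -m)"     # machine hardware name, e.g. x86_64 or arm64
echo "Installer file name will end in: -${os_name}-${arch}.sh"
```

If either value looks unexpected, check the miniforge releases page for an installer that matches your platform before continuing.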

Clone the bacpage repository:

git clone https://github.com/CholGen/bacpage.git

Switch to the development branch of the pipeline:

cd bacpage/
git checkout split_into_command

Install and activate the pipeline's conda environment:

mamba env create -f environment.yaml
mamba activate bacpage

Install the bacpage command:

pip install .

Test the installation:

bacpage -h
bacpage version

These commands should print the help text and the version of the program. Please create an issue if this is not the case.
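If you want a slightly more informative check, the two commands can be wrapped in a small helper. This is only a sketch: check_command is a hypothetical function, not part of bacpage, and it assumes both bacpage commands exit with status 0 on success.

```shell
# Hypothetical helper: confirm a command is on PATH and that it runs successfully.
check_command() {
    # $1 is the command name; any remaining arguments are passed to it.
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "missing: $1 is not on PATH (is the bacpage environment active?)" >&2
        return 1
    fi
    if "$@" >/dev/null 2>&1; then
        echo "ok: $*"
    else
        echo "failed: $* exited with a non-zero status" >&2
        return 1
    fi
}

# Run the two checks; a failure is reported without aborting the shell.
check_command bacpage -h || echo "bacpage -h check did not pass"
check_command bacpage version || echo "bacpage version check did not pass"
```

A "missing" message usually means the conda environment was not activated in the current shell.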

Updating

Navigate to the directory where you cloned the bacpage repository on the command line:

cd bacpage/

Activate the bacpage conda environment:

mamba activate bacpage

Pull the latest changes from GitHub:

git pull

Update the bacpage conda environment:

mamba env update -f environment.yaml

Reinstall the bacpage command:

pip install .
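The update steps can also be collected into one function so they run in order and stop at the first failure. The sketch below is an illustration, not something shipped with bacpage: update_bacpage is a hypothetical name, and it assumes you have already activated the environment, which it checks through the CONDA_DEFAULT_ENV variable set on activation.

```shell
# Hypothetical helper: pull, update the environment, and reinstall, stopping on the first error.
update_bacpage() {
    repo_dir="${1:-bacpage}"   # path to your clone; defaults to ./bacpage

    if [ ! -d "$repo_dir/.git" ]; then
        echo "error: $repo_dir does not look like a git clone" >&2
        return 1
    fi
    if [ "${CONDA_DEFAULT_ENV:-}" != "bacpage" ]; then
        echo "error: activate the environment first (mamba activate bacpage)" >&2
        return 1
    fi

    # The subshell keeps the caller's working directory unchanged.
    (
        cd "$repo_dir" &&
        git pull &&
        mamba env update -f environment.yaml &&
        pip install .
    )
}

# Example call (commented out so that defining the function has no side effects):
# update_bacpage path/to/bacpage
```

Chaining the commands with && means a failed git pull never leads to reinstalling a stale copy.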

Usage

Activate the bacpage conda environment:

mamba activate bacpage

Create a directory specifically for the batch of samples you would like to analyze (called a project directory):

bacpage setup [your-project-directory-name]

Place paired sequencing reads in the input/ directory of your project directory.
From the pipeline's directory, run the reference-based assembly pipeline on your samples using the following command:

bacpage assemble [your-project-directory-name]

This will generate a consensus sequence in FASTA format for each of your samples and place them in <your-project-directory-name>/results/consensus_sequences/<sample>.masked.fasta. An HTML report containing alignment and quality metrics for your samples can be found at <your-project-directory-name>/results/reports/qc_report.html.
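To confirm that a run produced the files described above, a short check along these lines may help. It is a sketch that relies only on the two output paths named in this section; check_outputs is a hypothetical helper, not a bacpage command.

```shell
# Hypothetical helper: summarize the assemble outputs for one project directory.
check_outputs() {
    project_dir="$1"
    consensus_dir="$project_dir/results/consensus_sequences"
    report="$project_dir/results/reports/qc_report.html"

    count=0
    for fasta in "$consensus_dir"/*.masked.fasta; do
        [ -e "$fasta" ] || continue   # the glob matched nothing
        count=$((count + 1))
        # Print the sample name by stripping the .masked.fasta suffix.
        echo "consensus: $(basename "$fasta" .masked.fasta)"
    done
    echo "consensus sequences found: $count"

    if [ -f "$report" ]; then
        echo "QC report: $report"
    else
        echo "QC report not found at $report" >&2
        return 1
    fi

    # Succeed only if at least one consensus sequence exists.
    [ "$count" -gt 0 ]
}

# Example call (commented out; replace the argument with your project directory):
# check_outputs your-project-directory-name
```

The function returns a non-zero status when the report or all consensus sequences are missing, so it can be used in a script that processes several project directories.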
