Releases: kishwarshafin/pepper

PEPPER-Margin-DeepVariant r0.8 update

16 Mar 20:32

In this release:

  • Separate SNP and INDEL calling with DeepVariant: SNP calling uses the none model and INDEL calling uses the rows model.
  • Parameters to select manual SNP and INDEL models for DeepVariant.
  • Separate parameters for handling candidates in repeat regions within PEPPER.
  • Update and fix training documentation.
  • Update to DeepVariant version 1.3.0.

PEPPER-Margin-DeepVariant r0.7 update

21 Dec 22:34

Version r0.7 update

  • Detailed explanation of methods.
  • Detailed performance evaluation on ONT and PacBio-HiFi data.
  • Included training documentation for PEPPER-Margin-DeepVariant.
  • Examples of how to tune parameters to balance speed and accuracy.
  • State-of-the-art results for all nanopore chemistries.

PEPPER-Margin-DeepVariant r0.6

03 Nov 03:24

Release 0.6 comes with these updates:

  • At least 3x runtime acceleration on Oxford Nanopore and 2x acceleration on PacBio-HiFi variant calling pipeline.
  • Support for R10.4 Q20 variant calling.
  • Wide range of parameters available for tuning PEPPER-DeepVariant to the user's use case.
  • Ability to provide customized models for PEPPER-DeepVariant.

To be added to this release shortly:

  • Full training documentation on how to train a model end-to-end.
  • Documentation and explanation of the available parameters and their downstream effect in variant calling.

r0.5

26 Aug 21:31

Updates to Oxford Nanopore variant calling in v0.5:

  • Reduced the search space of PEPPER by predicting only on sites with variants.
  • Added CNN layers on top of the RNNs to improve predictions.
  • Removed PEPPER HP from the pipeline.
  • Support for the rows model for DeepVariant, which uses alt alignment and significantly improves INDEL performance.
  • Support for Guppy 5.0.7 and the high-accuracy mode of Guppy.

PEPPER v0.4 release for Zenodo

04 Aug 23:41

Archived release of v0.4. No updates.

PEPPER-Margin-DeepVariant release

05 Mar 16:16

This is the official release of PEPPER-Margin-DeepVariant. It supports the Nanopore and PacBio HiFi variant calling and assembly polishing pipelines.

Key highlights and improvements:

  • Candidate finding with PEPPER HP is implemented in a manner best suited to DeepVariant's image generation. You will no longer see called variants with an allele frequency of 0, because candidate finding now matches DeepVariant's own candidate finding, with RNN predictions used to rank the candidates. PEPPER-DeepVariant can be used as a standard small variant calling tool.
  • PEPPER SNP is improved and tuned to work with Margin so that PEPPER-Margin produces the best haplotyping results for Oxford Nanopore and PacBio HiFi data.
  • Overall, we see a 30x improvement in the total runtime of the pipeline, and PEPPER itself is 20x faster than in r0.1. We expect more runtime improvements in the future.

Discontinued features:

  • PEPPER is no longer supported as a standalone assembly polishing tool. We believe the PEPPER-Margin-DeepVariant pipeline is much more sensitive to the structure of the assembly and provides a way to avoid over-polishing an assembly by filtering the VCF. PEPPER alone lacks this capability, so we dropped haploid assembly polishing, and assembly polishing in general, with PEPPER alone.
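
To illustrate what "filtering the VCF" to avoid over-polishing could look like, here is a minimal sketch in Python. It is not the pipeline's own filtering logic: the function name `filter_vcf_lines`, the `min_qual` threshold of 10, and the example records are all assumptions made for illustration. It keeps header lines and only those records whose FILTER column is `PASS` or `.` and whose QUAL is at or above the threshold.

```python
def filter_vcf_lines(lines, min_qual=10.0):
    """Keep VCF headers and records that pass a simple QUAL/FILTER check.

    Hypothetical helper: the threshold and the rule are illustrative only.
    """
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)  # header lines are always retained
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # skip malformed records
        qual, filter_status = fields[5], fields[6]
        if filter_status not in ("PASS", "."):
            continue  # drop records flagged by the caller
        if qual == "." or float(qual) < min_qual:
            continue  # drop records with missing or low quality
        kept.append(line)
    return kept


# Made-up example records (not real pipeline output).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "contig_1\t100\t.\tA\tG\t35.2\tPASS\t.",
    "contig_1\t250\t.\tC\tCT\t3.1\tPASS\t.",
    "contig_1\t400\t.\tG\tT\t40.0\tRefCall\t.",
]
filtered = filter_vcf_lines(example)
```

Only the edits that survive the filter would then be applied to the assembly; raising `min_qual` makes polishing more conservative.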

PEPPER v0.1 release

09 Oct 20:11

PEPPER v0.1 release notes (haploid assembly polisher)

PEPPER is a recurrent neural network-based haploid genome assembly polisher. This is the first release of the haploid assembly polishing component of PEPPER. We tested PEPPER's performance on several human genome samples, Zymo microbial community samples, and non-model organisms. The performance of PEPPER suggests that we can achieve highly accurate genome assemblies using ONT reads only.

Installation

PEPPER can be installed via pip.

python3 -m pip install pepper-polish
# if you get a permission error, try:
python3 -m pip install --user pepper-polish

python3 -m pepper.pepper --help
python3 -m pepper.pepper polish --help
# Expected output: PEPPER VERSION:  0.1.1
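
If you prefer to check the installation from Python before running the commands above, a minimal sketch follows. It only tests whether a top-level module can be found; the helper name `is_module_available` is an assumption for illustration, and `pepper` is the module name taken from the `python3 -m pepper.pepper` command above.

```python
import importlib.util


def is_module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "pepper" is the top-level module used by `python3 -m pepper.pepper`.
    if is_module_available("pepper"):
        print("pepper found; try: python3 -m pepper.pepper --help")
    else:
        print("pepper not found; try: python3 -m pip install --user pepper-polish")
```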

Models

The model files are available here: https://github.com/kishwarshafin/pepper/tree/r0.1/models

MinION_r10_native_microbial.pkl : For R10.3 Guppy 3.4.8 (Microbial)
MinION_r10_pcr_microbial.pkl : For R10.3 Guppy 3.4.8 (Microbial)
PEPPER_polish_haploid_guppy360.pkl : Supports Guppy 3.0.5 to Guppy 4+ (Large genomes; trained to be sensitive to the heterozygosity of the genome, can be used in phase-aware polishing)
PromethION_r941_guppy305_HAC_human.pkl : Supports Guppy 3.0.5 to Guppy 4+ (Large genomes)
PromethION_r941_guppy305_HAC_microbial.pkl : Supports Guppy 3.0.5 to Guppy 4+ (Microbial)
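
As a convenience, the list above can be expressed as a small lookup table. This is only a sketch: the file names and descriptions are copied from the list, while the `MODELS` structure, the `"microbial"` / `"large"` labels, and the `models_for` helper are assumptions introduced for illustration and are not part of PEPPER.

```python
# (file name, supported basecaller versions, genome type) copied from the list above.
MODELS = [
    ("MinION_r10_native_microbial.pkl", "R10.3 Guppy 3.4.8", "microbial"),
    ("MinION_r10_pcr_microbial.pkl", "R10.3 Guppy 3.4.8", "microbial"),
    ("PEPPER_polish_haploid_guppy360.pkl", "Guppy 3.0.5 to Guppy 4+", "large"),
    ("PromethION_r941_guppy305_HAC_human.pkl", "Guppy 3.0.5 to Guppy 4+", "large"),
    ("PromethION_r941_guppy305_HAC_microbial.pkl", "Guppy 3.0.5 to Guppy 4+", "microbial"),
]


def models_for(genome_type):
    """Return the model file names listed for a genome type ("microbial" or "large")."""
    return [name for name, _, kind in MODELS if kind == genome_type]


large_genome_models = models_for("large")
microbial_models = models_for("microbial")
```

The chosen file would then be downloaded from the models directory linked above and passed to the polisher.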

Motivation

Assemblies generated using ONT data usually have low base-level quality and require further polishing. Existing polishers like Racon-Medaka can improve the base-level quality of an assembly but perform poorly in transcriptome completeness. Previously, we introduced a new polisher suite, MarginPolish-HELEN, with superior performance in transcriptome completeness and base-level accuracy. However, MarginPolish-HELEN has runtime and cost overhead. To overcome this, we developed PEPPER, which uses local realignment of reads to the assembly to produce highly accurate polished genome assemblies while being sensitive to the structural integrity of the assembly. PEPPER can be paired with Shasta, Flye, Canu or any other ONT-based assembler. The performance of PEPPER as a standalone assembly polisher is superior to any other existing ONT assembly polisher, including MarginPolish-HELEN.

We participated in the HPRC assembly bakeoff, where the Shasta-PEPPER HG002 assembly achieved Q35 in assembly quality while having transcriptome completeness similar to that reported in the Shasta-MarginPolish-HELEN paper.

Extension to variant calling

In collaboration with Google Health, we used a modified version of the haploid assembly polisher mode of PEPPER and paired it with DeepVariant to achieve state-of-the-art performance in reference-based small variant calling with ONT reads. Our effort was recognized by the PrecisionFDA Truth Challenge V2, where PEPPER-DeepVariant achieved top awards in the ONT category. This work is still in development, and future releases will include details about the modules we are developing to enable ONT-based variant calling.

Collaboration with the Darwin Tree of Life project and other projects

The Darwin Tree of Life project plans to sequence and assemble all known species of animals, plants, fungi and protists in Britain and Ireland. The project picked Shasta to generate de novo ONT assemblies efficiently and, after evaluating multiple existing assembly polishers, picked PEPPER to polish the assemblies. We are collaborating with Ksenia Krasheninnikova from the Wellcome Sanger Institute, who is actively evaluating PEPPER on non-model vertebrate genomes and helping us improve our methods.

We are also collaborating with several other groups to use PEPPER to polish ONT-based genome assemblies. We have applied PEPPER to polish tomato genomes, non-human vertebrate genomes, highly heterozygous plant genomes and microbial genomes. In all cases, we saw better performance than existing polishing tools in both the structural integrity of the genome assembly and base-level quality.

Future direction

PEPPER builds a foundation upon which we plan to develop a set of next-generation genome inference tools for ONT reads. In collaboration with Google Health, we were able to use PEPPER as a primary candidate finder that enabled DeepVariant to identify variants from ONT reads accurately. We plan to keep improving the variant-calling pipeline. Moreover, Shasta is now producing haplotype-resolved genome assemblies, and we plan to deploy a diploid assembly polishing pipeline soon.