ONT R9.4.1 Guppy 5.0.7 "Sup" HG003 whole genome performance evaluation

We hold out the HG003 sample when training PEPPER-Margin-DeepVariant, so we use HG003 to demonstrate whole-genome performance. This evaluation compares PEPPER-Margin-DeepVariant r0.7 and r0.8 on both runtime and accuracy.

Setup

We used the following dataset:

Sample:                   HG003 (Whole genome)
Coverage:                 ~85x
Chemistry:                R9.4.1
Basecaller:               Guppy 5.0.7 "Sup"

Downsampling the alignment file:

We downsampled the ~85x alignment file to approximately 60x and 30x using the following commands, where the value passed to -s is the fraction of reads to keep:

samtools view -s 0.71 -b -@${THREADS} HG003_guppy_507_2_GRCh38_pass.bam > HG003_guppy_507_2_GRCh38_pass.60x.bam
samtools view -s 0.36 -b -@${THREADS} HG003_guppy_507_2_GRCh38_pass.bam > HG003_guppy_507_2_GRCh38_pass.30x.bam
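The fractions passed to -s correspond to the ratio of target coverage to original coverage. A minimal sketch of that arithmetic follows, assuming the original coverage is exactly 85x (the source only states ~85x). The helper name subsample_fraction is hypothetical and is not part of samtools.

```shell
# Hypothetical helper: fraction of reads to keep when going from
# an original coverage to a lower target coverage.
# args: target_coverage original_coverage
subsample_fraction() {
    awk -v t="$1" -v o="$2" 'BEGIN { printf "%.3f\n", t / o }'
}

subsample_fraction 60 85   # ratio for the 60x subset
subsample_fraction 30 85   # ratio for the 30x subset
```

Under that assumption the ratios come out to 0.706 and 0.353, close to the 0.71 and 0.36 used in the commands above.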

Calling variants with PEPPER-Margin-DeepVariant (PEPPER r0.8)

time docker run -it -v /data:/data \
-u `id -u`:`id -g` \
kishwars/pepper_deepvariant:r0.8 \
run_pepper_margin_deepvariant call_variant \
-b $BAM \
-f $REF \
-o $OUTPUT_DIR \
-t $THREADS \
-s HG003 \
--ont_r9_guppy5_sup 2>&1 | tee $LOG_FILE
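The command above expects several shell variables to be set beforehand. One possible setup is sketched below. Every path and value is a hypothetical placeholder rather than the configuration used in this evaluation. Because the docker command mounts only /data into the container, the sketch assumes that all inputs and outputs live under that directory.

```shell
# Hypothetical values; adjust to your own data layout.
# All paths sit under /data because only /data is mounted into the container.
BAM="/data/HG003_guppy_507_2_GRCh38_pass.60x.bam"  # directory is assumed
REF="/data/GRCh38.fa"                              # assumed reference file name
OUTPUT_DIR="/data/pepper_r0.8_HG003_60x"           # assumed output location
THREADS=32                                         # assumed thread count
LOG_FILE="/data/pepper_r0.8_HG003_60x.log"         # assumed log path

# Print the settings for review before launching the run.
printf 'BAM=%s\nREF=%s\nOUTPUT_DIR=%s\nTHREADS=%s\nLOG_FILE=%s\n' \
    "$BAM" "$REF" "$OUTPUT_DIR" "$THREADS" "$LOG_FILE"
```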

Results

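At every coverage tested (30x, 60x, 85x), PEPPER-Margin-DeepVariant r0.8 achieves a higher F1-score than r0.7 for both SNPs and INDELs. The INDEL gain comes from substantially higher recall, at the cost of somewhat lower precision: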

[Figure: PEPPER performance whole genome]

HG003 30x performance:

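| Sample | Version | Type | Truth total | True positives | False negatives | False positives | Recall | Precision | F1-Score |
|---|---|---|---|---|---|---|---|---|---|
| HG003 30x | r0.7 | INDEL | 504501 | 317621 | 186880 | 35084 | 0.629575 | 0.902714 | 0.7418 |
| HG003 30x | r0.7 | SNP | 3327495 | 3310002 | 17493 | 11986 | 0.994743 | 0.996393 | 0.995567 |
| HG003 30x | r0.8 | INDEL | 504501 | 345384 | 159117 | 51842 | 0.684605 | 0.872481 | 0.767209 |
| HG003 30x | r0.8 | SNP | 3327495 | 3309038 | 18457 | 9173 | 0.994453 | 0.997236 | 0.995843 |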

HG003 60x performance:

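| Sample | Version | Type | Truth total | True positives | False negatives | False positives | Recall | Precision | F1-Score |
|---|---|---|---|---|---|---|---|---|---|
| HG003 60x | r0.7 | INDEL | 504501 | 366144 | 138357 | 33484 | 0.725755 | 0.91827 | 0.810741 |
| HG003 60x | r0.7 | SNP | 3327495 | 3317492 | 10003 | 8548 | 0.996994 | 0.99743 | 0.997212 |
| HG003 60x | r0.8 | INDEL | 504501 | 394987 | 109514 | 44678 | 0.782926 | 0.90091 | 0.837785 |
| HG003 60x | r0.8 | SNP | 3327495 | 3317515 | 9980 | 7120 | 0.997001 | 0.997859 | 0.99743 |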

HG003 85x performance:

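| Sample | Version | Type | Truth total | True positives | False negatives | False positives | Recall | Precision | F1-Score |
|---|---|---|---|---|---|---|---|---|---|
| HG003 85x | r0.7 | INDEL | 504501 | 383384 | 121117 | 30595 | 0.759927 | 0.927982 | 0.835588 |
| HG003 85x | r0.7 | SNP | 3327495 | 3318437 | 9058 | 8032 | 0.997278 | 0.997586 | 0.997432 |
| HG003 85x | r0.8 | INDEL | 504501 | 412169 | 92332 | 38633 | 0.816984 | 0.91651 | 0.86389 |
| HG003 85x | r0.8 | SNP | 3327495 | 3318308 | 9187 | 6733 | 0.997239 | 0.997976 | 0.997607 |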
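
As a sanity check on the reported metrics, recall can be recomputed from the true-positive and false-negative counts, and the F1-score from the reported recall and precision. The sketch below uses the standard definitions recall = TP / (TP + FN) and F1 = 2PR / (P + R). The helper names are hypothetical. Precision is taken as reported rather than recomputed: for INDELs, TP / (TP + FP) from the listed counts does not reproduce the reported precision, which may mean the benchmarking tool counts query-side true positives separately (an assumption, since the source does not say how the metrics were produced).

```shell
# Hypothetical helpers using the standard metric definitions.

# args: true_positives false_negatives
recall() {
    awk -v tp="$1" -v fneg="$2" 'BEGIN { printf "%.4f\n", tp / (tp + fneg) }'
}

# args: precision recall
f1_score() {
    awk -v p="$1" -v r="$2" 'BEGIN { printf "%.4f\n", 2 * p * r / (p + r) }'
}

# Values from the HG003 85x, r0.8, INDEL results.
recall 412169 92332        # prints 0.8170
f1_score 0.91651 0.816984  # prints 0.8639
```

Both values agree with the reported 0.816984 and 0.86389 to four decimal places.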