Releases: epi2me-labs/wf-human-variation

v2.2.0

04 May 08:14

Added

  • Output {{sample}}.stats.json file describing some key metrics for the analysis.
  • Summary of gene coverage if a 4-column BED is provided.
  • Automated sex determination using relative coverage of chrX and chrY.
  • Retry strategy added to snp:aggregate_pileup_variants to prevent out-of-memory errors.
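
As an illustration of the sex determination entry above, here is a minimal Python sketch of how relative chrX and chrY coverage could be turned into an XX/XY call. The function name, the normalisation against autosomal coverage and the 0.1 threshold are all assumptions made for this example; the release notes do not describe the workflow's actual rule.

```python
from statistics import mean


def infer_sex(mean_cov, y_ratio_threshold=0.1):
    """Return "XY" or "XX" from a mapping of contig name to mean coverage.

    Hypothetical helper: `y_ratio_threshold` is an illustrative cut-off,
    not a value taken from the workflow.
    """
    autosomal = [cov for ctg, cov in mean_cov.items() if ctg not in ("chrX", "chrY")]
    baseline = mean(autosomal)
    if baseline <= 0:
        raise ValueError("autosomal coverage must be positive")
    # Coverage of chrY relative to the autosomes: roughly 0.5 is expected
    # for an XY sample and close to 0 for an XX sample.
    rel_y = mean_cov.get("chrY", 0.0) / baseline
    return "XY" if rel_y >= y_ratio_threshold else "XX"


print(infer_sex({"chr1": 30.0, "chr2": 31.0, "chrX": 15.0, "chrY": 14.0}))
print(infer_sex({"chr1": 30.0, "chr2": 29.0, "chrX": 30.0, "chrY": 0.4}))
```

The returned labels match the XX and XY values that --sex accepts as of this release.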

Changed

  • --GVCF --phased will produce a phased GVCF.
  • Changed default phasing algorithm to whatshap; longphase can be selected instead with --use_longphase true.
    • The intermediate phasing is still performed using longphase.
  • Setting --snp --sv --phased will emit individually phased SNPs and SVs.
  • Phased bedMethyl files now follow the pattern {{ alias }}.wf_mods.{{ haplotype }}.bedmethyl.gz.
  • --sex parameter uses XX and XY rather than "female" and "male".
  • Update modkit to v0.2.6.
  • Improved modkit runtime by increasing the default threads and the default interval size, and by running modkit on individual contigs.
  • modkit is now run only on chromosomes 1-22, X, Y and MT, unless --include_all_ctgs is provided.
  • Increased minimum CPU requirement for the workflow to 16.
  • Filtering of SVs using a BED file now includes sites only partially overlapping the specified regions.
  • basecaller_cfg will be inferred from the basecall_model DS key of input read groups, if available
    • Providing --basecaller_cfg will not be required if basecall_model is present in the DS tag of the read groups of the input BAM
    • --basecaller_cfg will be ignored if a basecall_model is found in the input BAM
  • Reconciled workflow with wf-template v5.1.2
  • Update to Clair3 v1.0.8.
  • Update to longphase v1.7.1.

Fixed

  • Update schema to allow selection of multiple BAM files per sample in EPI2ME.
  • Spectre CNV report not handling cases when no CNVs detected.
  • Lines denoting normal maximum and pathogenic minimum thresholds now correctly displayed on STR repeat content plots.
  • To save storage, the workflow will not emit sample.cram if it has created sample.haplotagged.cram.
  • Emitting nonsense input.1 file

Removed

  • Single-step joint phasing of SV and SNP.
  • --output_separate_phased as the workflow emits only individually phased VCFs.
  • A copy of the reference and the generated reference cache is no longer output by the workflow.
    • The workflow encourages use of readily available standard reference sequences, so re-emitting the input reference as a workflow output unnecessarily consumes disk space.

v2.1.0

27 Mar 18:54

Changed

  • ClinVar version in SnpEff container updated to version 20240307.
  • Convert to BAM only when --cnv --use_qdnaseq is selected.
  • Update to Clair3 v1.0.6.
  • Update Spectre to fix an error when parsing Clair3 VCFs with multiple AFs.
  • Support for an input folder of multiple BAM files per sample with --bam (instead of only allowing a single BAM per sample).
  • refine_with_sv is now run per chromosome to reduce its memory footprint.

Fixed

  • Force minimap2 to clean up memory more aggressively. Empirically this reduces peak-memory use over the course of execution.
  • Handling of input VCF files with --vcf_fn.
  • --phased --sv --snp generates a truncated VCF file when # appears in the VCF INFO field
  • Some reporting scripts using too much memory.

Removed

  • CRAM as supported input format.
  • old_ref parameter as providing the reference of an existing CRAM is no longer needed.

v2.0.0

06 Mar 18:37
Compare
Choose a tag to compare

Changed

  • CNV calling with --cnv is now performed using Spectre, which is optimised for long reads.
    • Legacy CNV calling using QDNAseq may still be carried out with --cnv --use_qdnaseq.
    • The bin size parameter has been renamed from --bin_size to --qdnaseq_bin_size.
  • Skip CNV CRAM to BAM conversion if downsampling is required, to avoid creating an unnecessary intermediate file.
  • The output of --depth_intervals now has .bedgraph.gz extension.
  • SV workflow outputs SVs in the autosomes, sex chromosomes and MT; use --include_all_ctgs to output calls on all the sequences.
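
The contig restriction in the last entry above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the accepted names (chr1 to chr22, chrX, chrY, chrM, plus the unprefixed 1 to 22, X, Y, MT) and both function names are hypothetical, and the workflow may match contigs differently.

```python
def default_contigs(prefix, mito):
    """Build the set of autosome, sex chromosome and mitochondrial names."""
    names = [str(i) for i in range(1, 23)] + ["X", "Y", mito]
    return {prefix + name for name in names}


def select_contigs(all_contigs, include_all_ctgs=False):
    """Keep only primary contigs unless include_all_ctgs is set."""
    if include_all_ctgs:
        return list(all_contigs)
    # Assumed naming schemes: UCSC style (chr1, chrM) and unprefixed (1, MT).
    keep = default_contigs("chr", "M") | default_contigs("", "MT")
    return [ctg for ctg in all_contigs if ctg in keep]


contigs = ["chr1", "chrX", "chrM", "chrUn_KI270302v1", "chr1_KI270706v1_random"]
print(select_contigs(contigs))
print(select_contigs(contigs, include_all_ctgs=True))
```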

Added

  • Output definitions for coverage files.
  • N50 and mean coverage added to alignment report.

Fixed

  • EPI2ME Desktop incorrectly allowed selection of directory for tr_bed.
  • failedQCReport failing to generate a report.

v1.11.0

12 Feb 22:36
Compare
Choose a tag to compare

Changed

  • Add an additional whatshap haplotag process after the final VCF phasing.
  • Updates to the phasing subworkflow significantly impact the runtime and storage requirement for the workflow, as detailed here.
  • Several performance improvements which should noticeably reduce the running time of the workflow

Fixed

  • Updated the version of Straglr, which addresses the following:
    • Repeats can now be called in RFC1
    • Start position of called STRs is 1-based rather than 0-based
    • VCF headers now match those in the FORMAT field
  • Generate allChromosomes.bed using samtools faidx index instead of pyfaidx, to avoid a KeyError
  • Inconsistent file ownership of bundled Clair3 model files which could lead to subuid errors in some environments
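
For the allChromosomes.bed entry above, a whole-genome BED can be derived directly from a samtools faidx index, whose first two tab-separated columns are the sequence name and its length. The Python sketch below shows the idea; the function name is hypothetical and this is not the workflow's own implementation.

```python
import io


def fai_to_bed(fai_handle, bed_handle):
    """Write one BED interval per sequence, spanning the whole sequence."""
    for line in fai_handle:
        if not line.strip():
            continue
        name, length = line.rstrip("\n").split("\t")[:2]
        # BED intervals are 0-based and half-open, so a full sequence is 0..length.
        bed_handle.write(f"{name}\t0\t{int(length)}\n")


# Usage with in-memory text standing in for a .fai file and the output BED.
fai = io.StringIO("chr1\t248956422\t6\t60\t61\nchrM\t16569\t253105708\t60\t61\n")
bed = io.StringIO()
fai_to_bed(fai, bed)
print(bed.getvalue(), end="")
```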

v1.10.1

24 Jan 19:39

Fixed

  • Bug report form.

v1.10.0

24 Jan 00:53

Added

  • Clair3 4.3.0 models.

Changed

  • --phase_vcf, --joint_phasing and --phase_mod are now deprecated in favour of --phased; see the README for more details.
  • --use_longphase_intermediate is now deprecated, and --use_longphase false will consistently use whatshap throughout.
  • Running --phased --snp --use_longphase false will now phase indels too.
  • --basecaller_cfg currently provides the configuration to Clair3.
  • The clair3: prefix to Clair3 specific models is no longer required.

Fixed

  • CNV report generation fails if there is no consensus on the copy number of a chromosome
    • Undetermined category has been added to the Chromosome Copy Summary to account for these cases
  • readStats reports metrics on the downsampled BAM when --downsample_coverage is requested.
  • Spurious warning for missing MM/ML tags when a BAM fails the coverage threshold

Removed

  • wf-basecalling subworkflow
    • fast5_dir input and other basecalling related options have been removed from the workflow parameters
    • Users should run the standalone wf-basecalling workflow and provide the output to wf-human-variation
  • Mapula statistics with --mapula

v1.9.2

13 Dec 16:46

Fixed

  • --joint_phasing generating single-chromosome VCF files.

v1.9.1

06 Dec 21:11

Changed

  • ClinVar annotation of SVs has been temporarily removed because it was not being incorporated correctly. SnpEff annotations are still produced as part of the final SV VCF.
  • New documentation

Removed

  • --annotation_threads parameter, as the SnpEff process does not support multithreading.

Fixed

  • Truncated SV VCF header generated from vcfsort.
  • sed crashing with I/O error in some instances.
  • Missing flagstats file in output directory.

v1.9.0

17 Nov 12:54

Added

  • STR workflow report now includes additional plots which display repeat units and interruptions in each supporting read
  • CNV workflow now outputs an indexed VCF file to the output directory

Changed

  • Legend symbols in STR genotyping plot
  • Unambiguous naming of bedMethyl files generated with --mod
    • Unphased outputs will have the pattern [sample_name].wf_mods.bedmethyl.gz
    • Phased outputs will have the pattern [sample_name]_[1|2|ungrouped].wf_mods.bedmethyl.gz
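
The two bedMethyl naming patterns listed for this release can be expressed as a small Python helper. The function name and its argument handling are hypothetical and exist only to make the patterns concrete.

```python
def bedmethyl_name(sample_name, haplotype=None):
    """Build a bedMethyl file name following the v1.9.0 patterns.

    haplotype is None for unphased output, else "1", "2" or "ungrouped".
    """
    if haplotype is None:
        return f"{sample_name}.wf_mods.bedmethyl.gz"
    if haplotype not in ("1", "2", "ungrouped"):
        raise ValueError(f"unexpected haplotype label: {haplotype!r}")
    return f"{sample_name}_{haplotype}.wf_mods.bedmethyl.gz"


print(bedmethyl_name("sampleA"))
print(bedmethyl_name("sampleA", "ungrouped"))
```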

Fixed

  • Report step failing if bcftools stats file has only some sub-sections
  • Clair3 ignoring the bed file
  • merge_haplotagged_contigs incorrectly generating intermediate CRAM when input is BAM
  • STR content generation failing due to forward slash in disease name in variant_catalog_hg38.json
  • Report name for the read alignment statistics now follows the pattern [sample_name].wf-human-alignment-report.html

v1.8.3

05 Oct 14:28

Fixed

  • configure-jbrowse breaking on unescaped spaces