24.0.0

@github-actions github-actions released this 22 Jan 23:25

These release notes are automatically extracted from the full changelog.

Major Changes

  • ancestral, translate: For VCF inputs, please ensure you are using TreeTime 0.11.2 or later. Numerous bug fixes and improvements have been made in both Augur and TreeTime. #1355 and TreeTime #263 (@jameshadfield)
  • ancestral, translate: GenBank files now require the (GFF mandatory) source feature to be present. #1351 (@jameshadfield)
  • ancestral, translate: For GFF files, we extract the genome/sequence coordinates by inspecting the sequence-region pragma, region type and/or source type. This information is now required. #1351 (@jameshadfield)
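
The GFF coordinate lookup described in the last item can be sketched roughly as follows. This is a minimal illustration and not Augur's implementation: the function name `nuc_coordinates`, the made-up sample record, and the precedence order (pragma first, then a `region` feature, then a `source` feature) are all assumptions made for the example.

```python
def nuc_coordinates(gff_text):
    """Return (start, end) of the genome, 1-based inclusive, from GFF text.

    Hypothetical helper: looks at the ##sequence-region pragma first, then a
    feature of type 'region', then a feature of type 'source'.
    """
    pragma = None
    by_type = {}
    for line in gff_text.splitlines():
        line = line.strip()
        if line.startswith("##sequence-region"):
            # Pragma layout: ##sequence-region <seqid> <start> <end>
            fields = line.split()
            if len(fields) == 4:
                pragma = (int(fields[2]), int(fields[3]))
        elif line and not line.startswith("#"):
            cols = line.split("\t")
            # Column 3 is the feature type; columns 4 and 5 are start and end.
            if len(cols) >= 5 and cols[2] in ("region", "source"):
                by_type.setdefault(cols[2], (int(cols[3]), int(cols[4])))
    for candidate in (pragma, by_type.get("region"), by_type.get("source")):
        if candidate is not None:
            return candidate
    raise ValueError("GFF has no sequence-region pragma, region or source feature")


# Made-up record: the pragma supplies the genome coordinates.
sample = (
    "##gff-version 3\n"
    "##sequence-region chr1 1 1000\n"
    "chr1\texample\tgene\t10\t90\t.\t+\t.\tgene_name=geneA\n"
)
coords = nuc_coordinates(sample)
```

A file with none of the three sources of coordinates raises an error in this sketch, mirroring the note that the information is now required.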

Features

  • ancestral, translate: Improvements to VCF inputs / outputs. #1355 and TreeTime #263 (@jameshadfield)
    • Output VCF will better match the input VCF, including CHROM name and ploidy encoding.
    • VCF inputs now require --vcf-reference-output
    • AA sequences are now exported for the tree root
    • VCF writing is now 3 orders of magnitude faster (dataset dependent)
  • ancestral, translate: A range of improvements to how we parse GFF and GenBank reference files. #1351 (@jameshadfield)
    • translate will now always export a 'nuc' annotation in the output JSON, allowing it to pass validation
    • Gene/CDS names of 'nuc' are now forbidden.
    • If a Gene/CDS in the GFF/GenBank file cannot be parsed, we now print a warning.
  • ancestral: For VCF alignments, a VCF output file is now only created when requested via --output-vcf. #1344 (@jameshadfield)
  • ancestral: Improvements to command line arguments. #1344 (@jameshadfield)
    • Incompatible arguments are now checked, especially related to VCF vs FASTA inputs.
    • --vcf-reference and --root-sequence are now mutually exclusive.
  • translate: Tree nodes are checked against the node-data JSON input to ensure sequences are present. #1348 (@jameshadfield)
  • utils::load_features: This function may now raise AugurError. #1351 (@jameshadfield)
  • export v2: Automatically minify large outputs. Use --no-minify-json to disable this default behavior. #1352 (@victorlin)
  • Added a new file DEPRECATED.md to document timelines and progress of deprecated features in the Augur CLI and Python API. #1371 (@victorlin)
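
One common way to make two options mutually exclusive in a Python CLI is argparse's `add_mutually_exclusive_group`. The sketch below only illustrates that technique with the option names mentioned for `ancestral`. These notes do not say how Augur implements the check, and `build_parser`, `parse`, and the program name are hypothetical.

```python
import argparse


def build_parser():
    """Build a toy parser in which the two reference options cannot be combined."""
    parser = argparse.ArgumentParser(prog="ancestral-sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--vcf-reference", help="reference FASTA for VCF input")
    group.add_argument("--root-sequence", help="reference for FASTA input")
    return parser


def parse(argv):
    """Parse a list of arguments; argparse exits with status 2 on a conflict."""
    return build_parser().parse_args(argv)


args = parse(["--vcf-reference", "ref.fasta"])
```

Passing both options makes argparse print a usage error and raise `SystemExit` with status 2.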

Bug Fixes

  • ancestral, translate: Various fixes to VCF inputs / outputs. #1355 and TreeTime #263 (@jameshadfield)
    • Fix incorrect (but passing) tests
    • Fix case-sensitive sequence comparisons between the root and reference sequences.
    • Fix a bug where ambiguous alleles are not inferred (see #1380 for full details).
    • Fix a bug where positions with no sequence information were assigned a base because the mask was not being computed (see #1382 for full details).
    • More than one ALT allele is now correctly parsed
    • Mutations followed by an insertion are now parsed
    • Unchanged ref genotypes are now encoded as '0' rather than '.'
    • ALT alleles "*" are now valid (introduced in VCF spec 4.2, but observed in VCF 4.1 files)
    • Positions with no variation are no longer exported
  • ancestral, translate: Fixes for JSON (non-VCF) inputs. #1355 (@jameshadfield)
    • The "reference" translations are now from the provided reference sequence, not from the root of the tree. #1355 (@jameshadfield)
    • Fix a bug where positions with no sequence information were assigned a base because the mask was not applied (see #1382 for full details)
  • ancestral, translate: Avoid incompatibilities with Biopython >=1.82. #1374, #1387 (@victorlin)
  • ancestral, translate: Address Biopython deprecation warnings. #1379 (@victorlin)
  • ancestral: The --genes argument now accepts a file. Previously, the help text claimed this was possible, but it was not. #1353 (@victorlin)
  • translate: The 'source' ID for GFF files is now ignored as a potential gene feature (it is still used for overall nuc coords). #1348 (@jameshadfield)
  • translate: Improvements to command line arguments. #1348 (@jameshadfield)
    • --tree and --ancestral-sequences are now required arguments.
    • VCF-only arguments are now separated into their own group.
  • translate: Fixes a bug in the parsing behaviour of GFF files whereby the presence of the --genes command line argument would change how we read individual GFF lines. Issue #1349, PR #1351 (@jameshadfield)
  • If TreeTimeError is encountered, Augur now exits with code 2 rather than 0. (This restores the original behaviour.) #1367 (@jameshadfield)
  • Deprecate read_strains from augur.utils and add it to the public API under augur.io. #1353 (@victorlin)
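
Several of the VCF fixes listed above concern how genotype codes map onto alleles. As a rough illustration of the VCF conventions involved (not Augur's or TreeTime's parser), the hypothetical helper `sample_alleles` below resolves haploid genotype codes on a single data line: `0` is the REF allele, `1`, `2`, ... index the comma-separated ALT alleles, `.` means no call, and an ALT of `*` is accepted like any other allele. The sample names and the data line are made up.

```python
def sample_alleles(vcf_line, sample_names):
    """Map each sample to its allele string for one VCF data line.

    Illustrative only: assumes haploid calls with GT as the first FORMAT key.
    Missing calls ('.') map to None.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    ref, alt = cols[3], cols[4]
    # Index 0 is REF; indices 1 and up are the ALT alleles in order.
    alleles = [ref] + ([] if alt == "." else alt.split(","))
    calls = {}
    for name, field in zip(sample_names, cols[9:]):
        gt = field.split(":")[0]
        calls[name] = None if gt == "." else alleles[int(gt)]
    return calls


# Made-up line with two ALT alleles, one of which is '*'.
line = "chr1\t10\t.\tA\tG,*\t.\t.\t.\tGT\t0\t1\t2\t."
result = sample_alleles(line, ["s1", "s2", "s3", "s4"])
```

Here `s1` carries the unchanged reference allele (code `0`), while `s4` has no call (code `.`), which is the distinction the encoding fix above preserves.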