24.0.0
These release notes are automatically extracted from the full changelog.
Major Changes
- ancestral, translate: For VCF inputs please ensure you are using TreeTime 0.11.2 or later. A large number of bugfixes and improvements have been added in both Augur and TreeTime. #1355 and TreeTime #263 (@jameshadfield)
- ancestral, translate: GenBank files now require the (GFF mandatory) source feature to be present. #1351 (@jameshadfield)
- ancestral, translate: For GFF files, we extract the genome/sequence coordinates by inspecting the sequence-region pragma, region type and/or source type. This information is now required. #1351 (@jameshadfield)
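To illustrate the kind of information the GFF change above relies on, here is a minimal sketch of reading genome coordinates from a `##sequence-region` pragma. This is not Augur's implementation: the function name `sequence_region` and the example sequence ID are hypothetical, and the sketch ignores the region/source feature types that Augur also inspects.

```python
from typing import Iterable, Optional, Tuple


def sequence_region(gff_lines: Iterable[str]) -> Optional[Tuple[str, int, int]]:
    """Return (seqid, start, end) from the first ##sequence-region pragma, or None.

    Hypothetical helper for illustration only; coordinates are 1-based as in GFF3.
    """
    for line in gff_lines:
        if line.startswith("##sequence-region"):
            fields = line.split()
            if len(fields) != 4:
                raise ValueError(f"Malformed sequence-region pragma: {line!r}")
            _, seqid, start, end = fields
            return seqid, int(start), int(end)
    return None


# A made-up GFF snippet: the pragma declares a 1000 bp sequence.
example = [
    "##gff-version 3",
    "##sequence-region example_seq 1 1000",
    "example_seq\t.\tgene\t10\t100\t.\t+\t.\tID=gene1",
]
region = sequence_region(example)
print(region)
```

A GFF file without such a pragma (and without a region or source feature) gives a parser no way to determine the overall nucleotide coordinates, which is why this information is now required.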
Features
- ancestral, translate: Improvements to VCF inputs / outputs. #1355 and TreeTime #263 (@jameshadfield)
- Output VCF will better match the input VCF, including CHROM name and ploidy encoding.
- VCF inputs now require --vcf-reference-output
- AA sequences are now exported for the tree root
- VCF writing is now 3 orders of magnitude faster (dataset dependent)
- ancestral, translate: A range of improvements to how we parse GFF and GenBank reference files. #1351 (@jameshadfield)
- translate will now always export a 'nuc' annotation in the output JSON, allowing it to pass validation
- Gene/CDS names of 'nuc' are now forbidden.
- If a Gene/CDS in the GFF/GenBank file is unparsed we now print a warning.
- ancestral: For VCF alignments, a VCF output file is now only created when requested via --output-vcf. #1344 (@jameshadfield)
- ancestral: Improvements to command line arguments. #1344 (@jameshadfield)
- Incompatible arguments are now checked, especially related to VCF vs FASTA inputs.
- --vcf-reference and --root-sequence are now mutually exclusive.
- translate: Tree nodes are checked against the node-data JSON input to ensure sequences are present. #1348 (@jameshadfield)
- utils::load_features: This function may now raise AugurError. #1351 (@jameshadfield)
- export v2: Automatically minify large outputs. Use --no-minify-json to disable this default behavior. #1352 (@victorlin)
- Added a new file DEPRECATED.md to document timelines and progress of deprecated features in the Augur CLI and Python API. #1371 (@victorlin)
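The mutual exclusivity of --vcf-reference and --root-sequence noted above can be illustrated with Python's standard `argparse` module. This is a sketch of the general technique, not Augur's actual parser: the program name `demo` and the help strings are assumptions, and only the two flag names come from these notes.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser in which the two reference options cannot be combined."""
    parser = argparse.ArgumentParser(prog="demo")  # hypothetical program name
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--vcf-reference", help="reference for VCF input (illustrative help text)")
    group.add_argument("--root-sequence", help="reference for FASTA input (illustrative help text)")
    return parser


parser = build_parser()

# Supplying one of the two options is accepted.
args = parser.parse_args(["--vcf-reference", "ref.fasta"])
print(args.vcf_reference, args.root_sequence)

# Supplying both makes argparse print a usage error and exit with status 2.
exit_code = None
try:
    parser.parse_args(["--vcf-reference", "ref.fasta", "--root-sequence", "root.fasta"])
except SystemExit as err:
    exit_code = err.code
print("exit code when both are given:", exit_code)
```

Declaring the conflict in the parser means an incompatible invocation fails immediately, before any input files are read.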
Bug Fixes
- ancestral, translate: Various fixes to VCF inputs / outputs. #1355 and TreeTime #263 (@jameshadfield)
- Fix incorrect (but passing) tests
- Fix case-sensitive sequence comparisons between the root and reference sequences.
- Fix a bug where ambiguous alleles are not inferred (see #1380 for full details).
- Fix a bug where positions with no sequence information were assigned a base because the mask was not being computed (see #1382 for full details).
- More than one ALT allele is now correctly parsed
- Mutations followed by an insertion are now parsed
- Unchanged ref genotypes are now encoded as '0' rather than '.'
- ALT alleles "*" are now valid (introduced in VCF spec 4.2, but observed in VCF 4.1 files)
- Positions with no variation are no longer exported
- ancestral, translate: Fixes for JSON (non-VCF) inputs. #1355 (@jameshadfield)
- The "reference" translations are now from the provided reference sequence, not from the root of the tree. #1355 (@jameshadfield)
- Fix a bug where positions with no sequence information were assigned a base because the mask was not applied (see #1382 for full details)
- ancestral, translate: Avoid incompatibilities with Biopython >=1.82. #1374, #1387 (@victorlin)
- ancestral, translate: Address Biopython deprecation warnings. #1379 (@victorlin)
- ancestral: Previously, the help text for --genes falsely claimed that it could accept a file. Now, it can truly claim that. #1353 (@victorlin)
- translate: The 'source' ID for GFF files is now ignored as a potential gene feature (it is still used for overall nuc coords). #1348 (@jameshadfield)
- translate: Improvements to command line arguments. #1348 (@jameshadfield)
- --tree and --ancestral-sequences are now required arguments.
- VCF-only arguments are now separated into their own group.
- translate: Fixes a bug in the parsing behaviour of GFF files whereby the presence of the --genes command line argument would change how we read individual GFF lines. Issue #1349, PR #1351 (@jameshadfield)
- If TreeTimeError is encountered Augur now exits with code 2 rather than 0. (This restores the original behaviour.) #1367 (@jameshadfield)
- Deprecate read_strains from augur.utils and add it to the public API under augur.io. #1353 (@victorlin)
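Several of the VCF fixes above concern how genotypes refer to alleles: index 0 is the REF allele, indices 1 and up are the comma-separated ALT alleles (which may include "*"), and "." marks a missing call. The sketch below decodes one VCF data line under those rules. It is an illustration only, not the Augur or TreeTime parser: the function name `decode_genotypes`, the placeholder sample names, and the haploid-only handling are all assumptions.

```python
from typing import Dict, Optional


def decode_genotypes(vcf_line: str) -> Dict[str, Optional[str]]:
    """Map each sample column of one VCF data line to the allele it carries.

    Illustrative only: assumes haploid GT values and invents sample names,
    which a real parser would read from the #CHROM header line.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    # Index 0 is REF; indices 1..n are the comma-separated ALT alleles.
    alleles = [ref] + ([] if alt == "." else alt.split(","))
    gt_index = fields[8].split(":").index("GT")
    calls: Dict[str, Optional[str]] = {}
    for i, sample in enumerate(fields[9:]):
        gt = sample.split(":")[gt_index]
        # "." is a missing call; "0" is an unchanged reference genotype.
        calls[f"sample_{i}"] = None if gt == "." else alleles[int(gt)]
    return calls


# Columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT, then four samples.
line = "chr1\t5\t.\tA\tG,*\t.\tPASS\t.\tGT\t0\t1\t2\t."
calls = decode_genotypes(line)
print(calls)
```

Under this encoding a sample matching the reference is written as "0", which keeps it distinct from a sample with no call at that position.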