Release 24.2.0 · nextstrain/augur

These release notes are automatically extracted from the full changelog.

filter: Added a new option --query-columns that allows specifying what columns are used in --query along with the expected data types. If unspecified, automatic detection of columns and types is attempted. #1294 (@victorlin)
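As a rough illustration of why declaring column types matters for a query, the sketch below casts declared columns before comparing values. It is not Augur's implementation: the helper names (`parse_query_columns`, `cast_row`) are hypothetical, and the `column:type` pair format with the type names `bool`, `float`, `int` and `str` is an assumption about how the option is written.

```python
# Hypothetical helpers for illustration only; not Augur's code.
# Assumed: columns are declared as "column:type" pairs.
CASTERS = {
    "str": str,
    "int": int,
    "float": float,
    "bool": lambda value: value.strip().lower() == "true",
}


def parse_query_columns(pairs):
    """Turn ['coverage:float', 'region:str'] into {'coverage': float, ...}."""
    types = {}
    for pair in pairs:
        column, _, type_name = pair.rpartition(":")
        if not column or type_name not in CASTERS:
            raise ValueError(f"expected column:type, got {pair!r}")
        types[column] = CASTERS[type_name]
    return types


def cast_row(row, types):
    """Return a copy of row with the declared columns cast to their types."""
    return {key: types[key](value) if key in types else value
            for key, value in row.items()}


types = parse_query_columns(["coverage:float", "region:str"])
rows = [
    {"strain": "A", "region": "Asia", "coverage": "0.97"},
    {"strain": "B", "region": "Europe", "coverage": "0.80"},
]
typed_rows = [cast_row(row, types) for row in rows]

# With coverage cast to float, a numeric comparison behaves as intended.
selected = [row["strain"] for row in typed_rows if row["coverage"] >= 0.95]
```

Without the cast, `coverage` would remain text and a numeric comparison would fail or be wrong, which is the ambiguity that explicit types avoid.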
augur.io.read_metadata: A new optional columns argument allows specifying a subset of columns to load. The default behavior still loads all columns, so this is not a breaking change. #1294 (@victorlin)
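The idea of loading only a subset of columns can be sketched with the standard library. This is a minimal stand-in, not `augur.io.read_metadata` itself: the function `read_columns` and the sample data are hypothetical, and only the notion of an optional `columns` argument that defaults to loading everything comes from the note above.

```python
import csv
import io


def read_columns(handle, columns=None, delimiter="\t"):
    """Yield rows as dicts, restricted to `columns` when it is given.

    Hypothetical sketch: with columns=None every column is kept,
    mirroring the "default still loads all columns" behavior.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    fieldnames = reader.fieldnames or []
    if columns is not None:
        missing = [c for c in columns if c not in fieldnames]
        if missing:
            raise KeyError(f"columns not found in metadata: {missing}")
    for row in reader:
        yield row if columns is None else {c: row[c] for c in columns}


tsv = "strain\tdate\tregion\nA\t2020-01-01\tAsia\nB\t2020-02-01\tEurope\n"

subset = list(read_columns(io.StringIO(tsv), columns=["strain", "date"]))
everything = list(read_columns(io.StringIO(tsv)))
```

Restricting the columns up front keeps memory use proportional to the columns a command actually needs.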
augur parse: A new optional --output-id-field argument allows the user to select any ID field for the produced FASTA file (e.g. 'accession' instead of 'name' or 'strain'). #1403 (@j23414)
- When no --output-id-field is given and the data has both name and strain fields, name is still preferred over strain as the sequence ID field. However, a deprecation warning is now emitted: a future release will switch the order to prefer strain over name, for consistency with the rest of Augur.
- Added entry to DEPRECATED.md.
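The selection rule described for augur parse can be sketched as follows. This is an illustration under stated assumptions, not the code in Augur: `choose_id_field` is a hypothetical name, and raising an error when neither name nor strain is present is an assumption, since the note does not say what happens in that case.

```python
import warnings


def choose_id_field(fields, output_id_field=None):
    """Pick the field used as the sequence ID (hypothetical sketch)."""
    # An explicitly requested field always wins.
    if output_id_field is not None:
        if output_id_field not in fields:
            raise ValueError(f"{output_id_field!r} is not one of {fields}")
        return output_id_field

    # Both present: keep preferring name for now, but warn about the switch.
    if "name" in fields and "strain" in fields:
        warnings.warn(
            "'name' is currently preferred over 'strain' as the sequence ID; "
            "a future release will prefer 'strain'.",
            DeprecationWarning,
            stacklevel=2,
        )
        return "name"

    for candidate in ("name", "strain"):
        if candidate in fields:
            return candidate

    # Assumption: the note above does not describe this case.
    raise ValueError("no 'name' or 'strain' field found")


explicit = choose_id_field(["accession", "name", "strain"], "accession")
only_strain = choose_id_field(["strain", "date"])
```

Passing the ID field explicitly sidesteps the warning and keeps output stable across the planned change of default.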
Compression should now be supported for all input and output files. Please open an issue if you find one that doesn't! #1381 (@victorlin)

filter: In version 24.1.0, automatic conversion of boolean columns was accidentally removed. It has been restored with additional support for empty values evaluated as None. #1410 (@victorlin)
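A minimal sketch of boolean conversion with empty values mapped to None is shown below. The function name `to_optional_bool` and the accepted spellings ("true" and "false", case-insensitive) are assumptions made for illustration; the note only states that boolean columns are converted and that empty values evaluate as None.

```python
def to_optional_bool(value):
    """Convert a text cell to True, False, or None (hypothetical sketch)."""
    text = value.strip().lower()
    if text == "":
        return None  # empty cell: no value recorded
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"not a boolean value: {value!r}")


cells = ["True", "False", "", "true"]
converted = [to_optional_bool(cell) for cell in cells]

# A truthiness test keeps only rows that are explicitly true;
# both False and None (empty) rows are excluded.
kept = [cell for cell, flag in zip(cells, converted) if flag]
```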
filter: The order of rows in --output-metadata and --output-strains now reflects the order in the original --metadata. #1294 (@victorlin)
filter, frequencies, refine: Performance improvements to reading the input metadata file. #1294 (@victorlin)
- For filter, this comes with increased writing times for --output-metadata and --output-strains. However, net I/O time still decreased during testing of this change.
filter: Updated the help text of --include and --include-where to explicitly state that these options can add strains that are missing an entry from --sequences. #1389 (@victorlin)
filter: Fixed the summary messages to properly reflect force-inclusion of strains that are missing an entry from --sequences. #1389 (@victorlin)
filter: Updated wording of summary messages. #1389 (@victorlin)
Enforce UTF-8 encoding when reading and writing files. Improve error messages when a non-UTF-8 file is used. #1381 (@victorlin)
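One way to enforce UTF-8 on read and turn a decoding failure into a clearer message is sketched here. It is an assumption-laden illustration, not Augur's code: `read_text_utf8`, `write_bytes_to_temp`, and the error wording are hypothetical.

```python
import os
import tempfile


def read_text_utf8(path):
    """Read a file strictly as UTF-8, with a friendlier error on failure."""
    try:
        with open(path, encoding="utf-8", newline="") as handle:
            return handle.read()
    except UnicodeDecodeError as error:
        raise ValueError(
            f"{path!r} is not valid UTF-8; re-save it with UTF-8 encoding."
        ) from error


def write_bytes_to_temp(data):
    """Write raw bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".tsv")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


content = "strain\tcity\nA\tMálaga\n"
good = write_bytes_to_temp(content.encode("utf-8"))
bad = write_bytes_to_temp(content.encode("latin-1"))

try:
    good_text = read_text_utf8(good)
    try:
        read_text_utf8(bad)
        error_message = None
    except ValueError as error:
        error_message = str(error)
finally:
    os.remove(good)
    os.remove(bad)
```

Passing `encoding="utf-8"` explicitly avoids depending on the platform's default encoding, which differs between systems.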
