24.2.0
These release notes are automatically extracted from the full changelog.
Features
- filter: Added a new option `--query-columns` that allows specifying what columns are used in `--query` along with the expected data types. If unspecified, automatic detection of columns and types is attempted. #1294 (@victorlin)
- `augur.io.read_metadata`: A new optional `columns` argument allows specifying a subset of columns to load. The default behavior still loads all columns, so this is not a breaking change. #1294 (@victorlin)
- `augur parse`: A new optional `--output-id-field` argument allows the user to select any ID field for the produced FASTA file (e.g. 'accession' instead of 'name' or 'strain'). #1403 (@j23414)
  - When no `--output-id-field` is given and the data has both `name` and `strain` fields, `name` is still preferred over `strain` as the sequence ID field. A deprecation warning now states that the order will be switched to prefer `strain` over `name` in the future, to be consistent with the rest of Augur.
  - Added entry to DEPRECATED.md.
- Compression should now be supported for all input and output files. Please open an issue if you find one that doesn't! #1381 (@victorlin)
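
The ID field selection described for `augur parse` can be sketched roughly as follows. This is an illustrative sketch only, not Augur's actual implementation: the function name `pick_id_field` is hypothetical, and both the error raised for an unknown field and the fallback to the first field when neither `name` nor `strain` is present are assumptions of this sketch.

```python
import warnings


def pick_id_field(fields, output_id_field=None):
    """Hypothetical helper: choose which field supplies the FASTA sequence ID."""
    # An explicitly requested field always wins.
    if output_id_field is not None:
        if output_id_field not in fields:
            # Assumption: an unknown field is treated as an error.
            raise ValueError(f"Field {output_id_field!r} is not one of {fields}")
        return output_id_field

    # Both defaults present: keep preferring 'name' for now, but warn
    # that the preference will switch to 'strain' in the future.
    if "name" in fields and "strain" in fields:
        warnings.warn(
            "Both 'name' and 'strain' are present; using 'name'. "
            "A future version will prefer 'strain'.",
            DeprecationWarning,
        )
        return "name"

    # Only one of the defaults present: use it.
    for candidate in ("name", "strain"):
        if candidate in fields:
            return candidate

    # Assumption: fall back to the first field when neither default exists.
    return fields[0]
```

With this sketch, `pick_id_field(["accession", "name", "strain"], "accession")` returns `"accession"`, while `pick_id_field(["strain", "name"])` returns `"name"` and emits a `DeprecationWarning`.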
Bug Fixes
- filter: In version 24.1.0, automatic conversion of boolean columns was accidentally removed. It has been restored with additional support for empty values evaluated as `None`. #1410 (@victorlin)
- filter: The order of rows in `--output-metadata` and `--output-strains` now reflects the order in the original `--metadata`. #1294 (@victorlin)
- filter, frequencies, refine: Performance improvements to reading the input metadata file. #1294 (@victorlin)
  - For filter, this comes with increased writing times for `--output-metadata` and `--output-strains`. However, net I/O time still decreased during testing of this change.
- filter: Updated the help text of `--include` and `--include-where` to explicitly state that this can add strains that are missing an entry from `--sequences`. #1389 (@victorlin)
- filter: Fixed the summary messages to properly reflect force-inclusion of strains that are missing an entry from `--sequences`. #1389 (@victorlin)
- filter: Updated wording of summary messages. #1389 (@victorlin)
- Enforce UTF-8 encoding when reading and writing files. Improve error messages when a non-UTF-8 file is used. #1381 (@victorlin)
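
The restored boolean conversion in filter, with empty values evaluated as `None`, can be illustrated with a minimal standard-library sketch. This is not Augur's code: the helper name `convert_boolean_column` is hypothetical, and the rule that only the strings `true` and `false` (case-insensitive) count as boolean values is an assumption made for illustration.

```python
def convert_boolean_column(values):
    """Hypothetical helper: convert a column of strings to True/False/None.

    The column is converted only if every value is 'true', 'false'
    (case-insensitive), or empty; otherwise it is returned unchanged.
    """
    # Assumption: these are the only strings treated as boolean-like.
    mapping = {"true": True, "false": False, "": None}
    normalized = [value.strip().lower() for value in values]

    # Leave the column untouched if any value is not boolean-like.
    if not all(value in mapping for value in normalized):
        return list(values)

    # Empty strings become None rather than being forced to True or False.
    return [mapping[value] for value in normalized]


print(convert_boolean_column(["True", "", "false"]))  # [True, None, False]
print(convert_boolean_column(["yes", "no", ""]))      # ['yes', 'no', '']
```

Mapping empty strings to `None` keeps missing values distinct from `False`, so a query on such a column does not silently treat missing data as a negative.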