Releases: sourmash-bio/sourmash

v4.2.2

11 Aug 19:32
f7f2e83

Major new features:

  • added functionality to recover the original k-mers for a given set of hashes, via sourmash sig kmers and related functionality (#1653, #1695, #1701)
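
Hashes cannot be inverted, so recovering k-mers means re-hashing the k-mers of the original sequences and keeping those whose hash is in the sketch. The following is a minimal sketch of that idea only. It assumes a stand-in hash function (`toy_hash`, built on SHA-1) rather than the hash sourmash actually uses, it ignores reverse complements, and all function names are hypothetical.

```python
import hashlib


def toy_hash(kmer: str) -> int:
    """Stand-in 64-bit hash for illustration (an assumption, not sourmash's hash)."""
    digest = hashlib.sha1(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def recover_kmers(sequence: str, ksize: int, wanted_hashes: set) -> dict:
    """Return {hash: k-mer} for each k-mer in `sequence` whose hash is wanted."""
    found = {}
    for start in range(len(sequence) - ksize + 1):
        kmer = sequence[start:start + ksize]
        hashval = toy_hash(kmer)
        if hashval in wanted_hashes:
            found[hashval] = kmer
    return found


sequence = "ATGGCATTCGA"
ksize = 4
# Pretend a sketch retained only the hashes of these two k-mers.
sketch_hashes = {toy_hash("GGCA"), toy_hash("TCGA")}
recovered = recover_kmers(sequence, ksize, sketch_hashes)
print(sorted(recovered.values()))
```

The lookup needs the original sequences as input; a signature on its own only records the hash values.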

Documentation updates:

  • Updated picklist docs (#1683)
  • Updated the 'how to release' doc after 4.2.0 release (#1649)

Minor new features:

  • Adjusted dayhoff and hp encodings to tolerate stop codons in the protein sequence (#1673)
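
To illustrate what tolerating a stop codon in a reduced-alphabet encoding can look like, here is a small sketch using the standard six Dayhoff amino-acid classes. The single-letter codes, the `to_dayhoff` helper, and the choice to pass `*` through unchanged are assumptions made for this example; they are not taken from the sourmash source.

```python
# Standard six-group Dayhoff classes; the code letters are illustrative.
DAYHOFF_GROUPS = {
    "a": "C",
    "b": "AGPST",
    "c": "DENQ",
    "d": "HKR",
    "e": "ILMV",
    "f": "FWY",
}
AA_TO_DAYHOFF = {
    aa: code for code, members in DAYHOFF_GROUPS.items() for aa in members
}
STOP = "*"


def to_dayhoff(protein: str) -> str:
    """Encode a protein string, letting stop codons through instead of failing."""
    encoded = []
    for aa in protein.upper():
        if aa == STOP:
            encoded.append(STOP)  # assumption: the stop codon is kept as '*'
        elif aa in AA_TO_DAYHOFF:
            encoded.append(AA_TO_DAYHOFF[aa])
        else:
            raise ValueError(f"unexpected residue: {aa!r}")
    return "".join(encoded)


encoded = to_dayhoff("MKV*CW")
print(encoded)
```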

Bug fixes and performance improvements:

  • Fixed panic bug in sourmash sketch dna with bad input and --check-sequence (#1702)

Refactoring and cleanup:

  • Changed sourmash compute to sourmash sketch in tests/test_sourmash.py (#1680, #1687)
  • Tested and fixed sourmash_args.load_many_signatures(...) and lca_db.load_single_database (#1684)

v4.2.1

16 Jul 23:24
96920c7

This is a bug-fix and performance release of sourmash.

There are no major new features.

git log --oneline v4.2.0..latest

Minor new features:

  • new picklist coltypes for directly using gather, prefetch, and manifest outputs without specifying column name (#1660)
  • add --from-file to sig cat (#1657)
  • implement a lazy/on-demand Index loading class to support low memory tracking of a large index (#1661)
  • add sourmash tax prepare to build SQLite taxonomy databases for use with tax commands (#1651)
  • Support manifests in MultiIndex (#1654)
  • tax summarization additions and fixes, including reporting bp and unclassified (#1667)
  • add --from-file, improved sig selection to most sig commands (#1672)
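
The picklist idea (select signatures by matching a column of a CSV file) can be sketched in a few lines. This is a conceptual illustration only: the `DEFAULT_COLUMN` table, the column names, and the dictionary-based "signatures" are placeholders invented for the example, not sourmash's actual data structures or column mapping.

```python
import csv
import io

# Placeholder defaults for which column each "coltype" reads.
# These names are assumptions for illustration only.
DEFAULT_COLUMN = {"gather": "md5", "prefetch": "match_md5", "manifest": "md5"}


def load_picklist(csv_text, coltype, column=None):
    """Collect the set of values to select on from one CSV column."""
    column = column or DEFAULT_COLUMN[coltype]
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader if row.get(column)}


def pick(signatures, wanted):
    """Keep only the signature records whose md5 is in the picklist."""
    return [sig for sig in signatures if sig["md5"] in wanted]


gather_csv = "name,md5\ngenomeA,aaa111\ngenomeC,ccc333\n"
database = [
    {"name": "genomeA", "md5": "aaa111"},
    {"name": "genomeB", "md5": "bbb222"},
    {"name": "genomeC", "md5": "ccc333"},
]
wanted = load_picklist(gather_csv, "gather")
selected = pick(database, wanted)
print([sig["name"] for sig in selected])
```

With a per-coltype default, the caller no longer has to name the column when the CSV comes from a known command.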

Bug fixes and performance improvements:

  • fix bug in gather when run with scaled=1 (#1670)
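
For context, scaled=1 is the boundary case of the scaled parameter: a scaled sketch keeps roughly 1/scaled of all possible hash values, so scaled=1 keeps every hash. The sketch below illustrates that relationship under simplifying assumptions; the helper names are hypothetical, and the exact rounding and boundary handling in sourmash may differ. It does not reproduce the bug that was fixed.

```python
HASH_SPACE = 2 ** 64  # size of a 64-bit hash space


def cutoff_for_scaled(scaled: int) -> int:
    """Hashes below this value are retained (simplified, hypothetical helper)."""
    if scaled < 1:
        raise ValueError("scaled must be >= 1")
    return HASH_SPACE // scaled


def downsample(hashes, scaled):
    """Keep only the hashes that fall under the cutoff for `scaled`."""
    cutoff = cutoff_for_scaled(scaled)
    return {h for h in hashes if h < cutoff}


hashes = {10, 2 ** 63, 2 ** 64 - 1}
kept_all = downsample(hashes, scaled=1)   # cutoff is the whole hash space
kept_half = downsample(hashes, scaled=2)  # cutoff is 2**63
print(len(kept_all), len(kept_half))
```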

Documentation updates:

  • Add sourmash-bio/community Gitter badge to README (#1658)

Refactoring and cleanup:

  • add tests for sourmash tax --containment-threshold arg (#1666)
  • fix sourmash tax usage string (#1655)
  • add bounds checking for --scaled (#1650)

Rust interface:

  • Rust Core update (tag: r0.11.0) (#1643)

v4.2.0

01 Jul 13:21
@ctb
21f5e63

This release adds several significant features. First, we've added a set of command-line taxonomy tools for combining sourmash gather output with taxonomy databases. Second, we've added a new "picklist" feature that enables flexible selection of subsets of databases. Finally, we've added manifests to databases to support picklists as well as faster database loading and signature selection.

As of this release, we've also formally moved development over to the sourmash-bio organization on GitHub, and we've created a new gitter support channel, sourmash-bio/community. Please join us there if you have any questions, comments, or feature requests!

Major new features:

Documentation updates:

  • Add new GTDB databases description to docs and start legacy databases page (#1581)
  • Change dib-lab/ URLs to new sourmash-bio/ URLs. (#1629)
  • Add notice for sustainable open source study (#1580)

Minor new features:

  • alias --nucleotide, --no-nucleotide for moltype args. (#1632)
  • add signature names to known/unknown hash sigs output by sourmash prefetch (#1646)

Bug fixes and performance improvements:

  • Speed up sourmash gather with prefetch by ignoring unidentifiable hashes (#1613)
  • Check for MinHash compatibility in MinHash.intersection_and_union(...) (#1627)
  • Fix selection w/abund and manifest column type conversions (#1645)

Refactoring and cleanup:

  • Fix Rust 1.59 lints (#1600)
  • Minor cleanup in sourmash_args & sig submodules (#1586)
  • Minor cleanup in minhash module (#1585)
  • Fix needless borrows as suggested by clippy (#1636)

v4.1.2

07 Jun 21:27
@ctb
6b5806c

This is a bug-fix and performance release of sourmash.

There are no major new features.

Minor new features:

  • add query info to gather CSV output (#1565)

Bug fixes and performance improvements:

  • Improved MinHash.remove_many(...) performance by five orders of magnitude (#1571)
  • Fix SBT index saving bug that arbitrarily replaced names (but not content) of identical signatures in .sbt.zip files (#1568)
  • Empty zipfiles should not cause AssertionError (#1546)

Major refactoring and new internal functionality:

  • update MinHash.set_abundances to remove hash if 0 abund; handle negative abundances (#1575)

Refactoring and cleanup:

  • Fix tests that fail to close files that they open (#1550)
  • Add "&" and "|" as alternate syntax for MinHash intersection and merge (#1533)
  • Fix missing bracket in docs (#1566)
  • Updates for coverage tracking (#1558)
  • Provide a .copy() method for both SourmashSignature() and MinHash (#1551, #1570)
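
The operator syntax and .copy() method listed above follow a common Python pattern. The toy class below shows the general shape of that pattern on plain sets of integers; `ToySketch` is a hypothetical stand-in written for this example and is not the sourmash MinHash class.

```python
class ToySketch:
    """Hypothetical stand-in for a sketch: just a set of hash values."""

    def __init__(self, hashes=()):
        self.hashes = set(hashes)

    def copy(self):
        """Return an independent copy; mutating it leaves the original alone."""
        return ToySketch(self.hashes)

    def __and__(self, other):
        """a & b: hashes present in both sketches (intersection)."""
        return ToySketch(self.hashes & other.hashes)

    def __or__(self, other):
        """a | b: hashes present in either sketch (merge)."""
        return ToySketch(self.hashes | other.hashes)


a = ToySketch({1, 2, 3})
b = ToySketch({3, 4})
common = a & b
merged = a | b

clone = a.copy()
clone.hashes.add(99)  # does not affect `a`
print(sorted(common.hashes), sorted(merged.hashes), 99 in a.hashes)
```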

v4.1.1

22 May 14:06
e47bdb5

This release fixes a minor bug, provides some refactorings, and dramatically decreases memory consumption for sourmash gather --linear (which is, admittedly, a niche use case :).

No major new features.

Bug fixes and performance improvements:

  • Unload data with sourmash gather --linear on SBTs (#1534)
  • Fix sourmash gather --no-prefetch when used w/abund signatures (#1528)
  • Fix sourmash index to not create directory for .sbt.zip output (#1539)

Major refactoring and new internal functionality:

  • Add FrozenMinHash to better support separation of frozen and mutable data actions (#1508)

Refactoring and cleanup:

  • Improved error handling and testing for pathlist loading (#1469)
  • Updated some tests to use sourmash sketch instead of sourmash compute (#1536)
  • Refactor sourmash lca summarize to remove unnecessary if statements, improve tests (#1540)

v4.1.0

14 May 13:36
4c48f39

4.1.0 release notes

This release provides several convenient features for users, including zipfile collections on input and output and a new prefetch command. sourmash gather has also received a considerable speed/memory upgrade (twice as fast, 80-90% lower memory). You should upgrade! As a reminder, v4.x has several incompatibilities with v3.x, and if you are upgrading from v3.x you should consult our migration guide.

Major new features:

  • Support zipped collections of signatures (#1349)
  • Refactor gather functionality for speed & modularity (#1370, #1512, #1513)
  • Provide new command, prefetch. (#1370)
  • Add flexible & iterative support for outputting signatures in variety of collection formats - directories, zipfiles, etc. (#1493)
  • Add max_containment to API and --max-containment to command line (#1346)
  • Add --from-file option to sourmash sketch commands (#1362)
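
As a reminder of what max containment measures: containment of A in B is the fraction of A's hashes found in B, and max containment is the larger of the two directions, which is the shared hash count divided by the size of the smaller set. The functions below are a minimal sketch on plain Python sets; the names are chosen for this example and are not the sourmash API.

```python
def containment(a: set, b: set) -> float:
    """Fraction of a's elements that are also in b."""
    return len(a & b) / len(a) if a else 0.0


def max_containment(a: set, b: set) -> float:
    """Larger of the two containments: |a & b| / min(|a|, |b|)."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


small = {1, 2, 3, 4}
large = {3, 4, 5, 6, 7, 8, 9, 10}

print(containment(small, large))      # 2 of 4 hashes shared
print(containment(large, small))      # 2 of 8 hashes shared
print(max_containment(small, large))  # same as the larger of the two
```

Max containment is symmetric, which makes it convenient when the two sketches differ greatly in size.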

Bug fixes that break backwards compatibility:

  • Require scaled signatures for containment (#1381)
  • Fix CSV output for sourmash lca classify when .name is empty (#1401)
  • Very old SBTs created before sourmash v2.0 (SBT format versions 1 and 2) no longer load (changed in #1392)

Other bug fixes:

  • Add proper newline output for csv module (#1319) - important for Windows!

Other new features:

  • --best-only searches now work for both similarity AND containment (fixed in #1392)
  • sourmash categorize now takes all database types
  • add --name to sourmash sig merge (#1480)
  • decline to load really large files for LCA databases if they're not valid JSON (#1495)

Major refactoring and new internal functionality:

  • Add a MultiIndex class that wraps multiple Index classes (#1374)
  • Refactor and dramatically simplify database loading and compatibility checking (#1406, #1420)
  • Rework the find functionality for Index classes (#1392, #1477).
  • Improved intersection and union calculations (#1475)

Documentation enhancements:

  • Update the sourmash __init__.py docstring, provide __all__ for imports (#1364)
  • Add '-h/--help' usage instructions to 'sourmash sketch' CLI (#1400)
  • Add ORCID to contribution checklist (#1405)
  • Add information about updating the developer environment to the developer docs (#1432)
  • Docs: Partial fix for doc build issues with notebooks (#1516)

Refactoring and cleanup:

  • Refactor the database loading code in sourmash_args (#1373, #1380)
  • Pin needletail version to keep MSRV at 1.37 (#1393)
  • Rename load_file_list_of_signatures to load_pathlist_from_file (#1423)
  • Update call to notify in src/sourmash/search.py with f-strings (#1422)
  • Bump MSRV to 1.42 (and other dep fixes) (#1461)
  • CI/Rust: update and fix cbindgen config (#1473)
  • Refactor MinHash.downsample (#1458)
  • Make MinHash.downsample(...) require keyword arguments & fix newly revealed buggy test. (#1448)
  • Add a check for LCA database error text in tests/test_lca.py (#1445)
  • pin docutils version to last working (#1444)
  • add codecov configuration to fix paths (#1422, #1449)
  • provide new test fixtures for cleaner testing (#1487)
  • Fix small papercuts: SyntaxWarning and coverage reports (#1488)
  • Clean up clippy lints from 1.52 (#1505)
  • Bump docutils from 0.16 to 0.17.1 (#1499)
  • Update myst-parser requirement from ~=0.13.7 to >=0.13.7,<0.15.0 (#1520)
  • replace utils.TempDirectory with runtmp in some tests (#1502)

v4.0.0

02 Mar 19:39
4f43288

Major changes for 4.0

4.0 is a major new version of sourmash, and it contains a number of new features and breaking changes.

Please see our migration guide for more information on how to migrate from v3.x to version 4.0!

Numerical output and search results are unchanged

There are no changes to numerical output or search results in this release; you should get the same results with v4 as you get with v3, except where command-line parameters need to be adjusted as noted below (see: protein ksize #1277, lca summarize changes #1175, sourmash gather on signatures without abundance #1328). Please file an issue if your results change!

New or changed behavior

  • default SBT storage is now .sbt.zip (#1174, #1170)
  • add sourmash sketch command for creating signatures (#1159)
  • protein ksizes in MinHash are now divided by 3, except in sourmash compute (#1277)
  • refactor MinHash API and implementation: add, iadd, merge, hashes, and max_hash (#1282, #1154, #1139, #1301)
  • add HyperLogLog implementation (#1223)
  • SourmashSignature.name is now a property (not a method): use str(sig) instead of name() (#1179, #1232)
  • lca summarize no longer merges all signatures, and uses hash abundance by default (#1175)
  • index and lca index now support --from-file and no longer require signature files on the command line (#1186, #1222)
  • --traverse-directory is now on by default for signature loading behavior (#1178)
  • sourmash sketch and sourmash compute no longer create empty signatures from empty files and stdin (#1347)
  • sourmash sketch and sourmash compute set sig.filename to empty string when filename is - (#1347)
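
The protein ksize change above is a frequent source of confusion when migrating, so a small worked conversion may help. The helper below is hypothetical and only encodes the stated rule: sourmash compute ksizes for protein count nucleotides, while elsewhere in v4 they are divided by 3 and count amino acids.

```python
def compute_to_v4_protein_ksize(compute_ksize: int) -> int:
    """Convert a `sourmash compute` protein ksize to the v4 MinHash ksize."""
    if compute_ksize <= 0 or compute_ksize % 3 != 0:
        raise ValueError("compute-style protein ksizes must be positive multiples of 3")
    return compute_ksize // 3


for old_k in (21, 30, 57):
    print(old_k, "->", compute_to_v4_protein_ksize(old_k))
```

For example, a protein signature built with a compute ksize of 21 corresponds to a v4 protein ksize of 7.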

Feature removal

  • remove Python 2.7 support (& end Python 2 compatibility) (#1145, #1144)
  • remove lca gather (#1307)
  • remove 10x support from sourmash compute (#1229)
  • remove 'dump' command (#1157)

Feature/function deprecations

  • deprecate sourmash compute (#1159)
  • deprecate load_signatures, sourmash.load_one_signature, create_sbt_index, and load_sbt_index (#1279, #1304)
  • deprecate import_csv in favor of new sourmash sig import --csv (#1281)

Refactoring, improvements, and minor bug fixes:

  • accept file list in sourmash sig cat (#1236)
  • add unique_intersect_bp and gather_result_rank to gather CSV output (#1219)
  • remove deprecated minhash functions (#1149)
  • fix Rust panic error in signature creation (#1172)
  • cache nodes in SBT during search (#1161)
  • fix two bugs in gather --output-unassigned (#1156)
  • Refactor the gather code so that it uses 'hashes' instead of 'mins' (#1329)
  • Update output from gather w/o abundances, so that abund output is empty instead of 0 (#1328)

Documentation updates

  • substantial revisions and updates to the documentation (#1283)
  • add information about versioning, migrations, etc to the docs (#1153)

Infrastructure and CI changes:

  • update finch requirement from 0.3.0 to 0.4.1 (#1290)
  • update rand for test, and activate "js" feature for getrandom (#1275)
  • dev updates (configs and doc) (#1298)
  • move wheel building from Travis to GitHub Actions (#1295)
  • fix new clippy warnings from Rust 1.49 (#1267)
  • use tox for running tests locally (#696)
  • CI: small build fixes (#1252)
  • CI: Fix releases in GitHub Actions (#1250)
  • update build_wheel action paths
  • CI: moving python tests from travis to GH actions (#1249)
  • CI: move wheel building to GitHub actions (#1244)
  • remove last .rst file from docs (#1185)
  • update CI for latest branch name change (#1150)

v3.5.1

16 Feb 00:42
@ctb
3bfd0fa

Feature deprecations

  • add deprecation warning for sourmash compute --input-is-10x (#1326)
  • add warnings about new sourmash lca summarize behavior (#1326)
  • add warning for new behavior of MinHash.merge(...) (#1326)
  • add deprecation warning for TarStorage (#1165)

Infrastructure and CI changes:

  • Backport github actions to stable branch (3.5.x) (#1317)

v3.5.0

11 Aug 19:27
@ctb
111b46e

This is the first of several minor releases (v3.5.x) from the new stable branch. These releases focus on preparing for sourmash v4.0 by introducing deprecations and warnings for features that will be removed in v4.0.

Refactoring and deprecations:

  • MinHash class refactoring (#1128, #1129); many deprecations for 4.0 and 5.0
  • sourmash dump deprecated, for removal in 4.0 (#1147)
  • import sourmash_lib deprecated, for removal in 4.0 (#1143)

Cleanup:

  • remove mentions of ijson and khmer, which are no longer needed as dependencies (#1140)

Documentation:

  • Simplify and clean up README (#1124)
  • Add sourmash logo to docs and README (#1127)
  • update release process and release notes (#1125)

Rust:

  • Update typed-builder requirement from 0.6.0 to 0.7.0 (#1121)

v3.4.1

23 Jul 00:02
@ctb
f8d0262

Major new features:

  • Document sourmash.fig usage and behavior; enable output of compare clustering with labels (#859)
  • Add a --majority option to lca classify that uses a majority-vote algorithm (#1113)
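
To show how a majority vote can differ from a lowest-common-ancestor (LCA) classification, here is a small sketch over lineage tuples. It is one plausible reading of "majority vote" (pick the most frequently observed lineage), written under that assumption; the function names and example lineages are hypothetical and do not reflect the details of the sourmash implementation.

```python
from collections import Counter


def lowest_common_ancestor(lineages):
    """Longest rank-by-rank prefix shared by every lineage."""
    prefix = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        prefix.append(ranks[0])
    return tuple(prefix)


def majority_lineage(lineages):
    """Most frequently observed lineage, or None if there are no lineages."""
    counts = Counter(lineages)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# One lineage per matching hash (toy data).
ecoli = ("Bacteria", "Proteobacteria", "Escherichia coli")
bsub = ("Bacteria", "Firmicutes", "Bacillus subtilis")
observed = [ecoli, ecoli, ecoli, bsub]

print(lowest_common_ancestor(observed))  # collapses to the shared prefix
print(majority_lineage(observed))        # keeps the dominant lineage
```

In this toy case a single conflicting hash pulls the LCA up to the top rank, while the majority vote still reports the dominant lineage.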

Minor improvements:

  • Add MinHash compatibility check to sourmash sig intersect (#1116)

Bugs fixed:

  • add ksize selectors back into sourmash sig functions (#1105)

Documentation updates:

  • Minor updates to release procedure (#1102)
  • Update DB links in docs (#1084)