Releases: NVIDIA-Genomics-Research/GenomeWorks

v2021.02.2

13 May 00:10

Updated tqdm requirements for pygenomeworks.

v2021.02.1

12 May 17:15

Updated numpy and cython requirements for pygenomeworks.

v2021.02.0

17 Mar 10:47
  • CUDA Mapper

    • Created a new interface for working with indices; indices can now be grouped together through the public API.
    • Added support for creating an Index from descriptor objects, and this is now used throughout.
    • Added API support for outputting sequences/alignments in SAM/BAM format.
    • Positions of reads' minimizers are now determined on device instead of on host.
    • Reads that are too short to fit in at least one window are now skipped correctly.
    • #557 and #562 correct errors in the evaluate_paf script that could result in incorrect counts of matched starts / ends of records or precision and recall values that were greater than 1.
  • CUDA Aligner

    • Fixed a bug in banded Myers that could cause an out-of-bounds access for non-optimal alignments. The access occurred when the backtrace in the Needleman-Wunsch matrix touched the border of the band at a specific point.
    • Added support for the extended CIGAR format, which distinguishes between matches (=) and mismatches (X).
    • Improved performance for batches with widely varying alignment lengths.
    • Added a FixedBandAligner base class (as a specialization of Aligner) for aligners that operate on a diagonal band. These aligners now provide an aligner->reset_bandwidth(new_bandwidth) function.
    • Fixed the memory requirements of the Hirschberg-Myers aligner. It can now process significantly larger batches at once.
    • create_aligner() now returns the banded Myers aligner (a FixedBandAligner) when called with a bandwidth, and the Hirschberg-Myers aligner (an Aligner) when called without one. The API for the latter case is deprecated and will be replaced by a different create_aligner() function.
  • CUDA Partial Order Aligner (CUDA POA)

    • #551 Adds GFA output of the alignment graph generated by cudaPOA.
    • Applied various changes to optimize performance of kernels for banded alignments.
    • Added option -s to the CUDA POA API for managing the memory allocated for the adaptive score matrix.
    • Introduced static_band_traceback as a new alignment mode in CUDA POA. This mode can potentially improve performance for processing long-read batches.
    • Introduced adaptive_band_traceback for long-read batches. Different banded versions of Needleman-Wunsch kernels were unified.
    • Added new CI tests to validate results of static/adaptive-band and static/adaptive-band with traceback against full-band Needleman-Wunsch kernel.
    • Added description and hints to CUDA POA error codes.
    • Added support for caching device allocations to reduce time spent allocating device memory.
  • CUDA Extender

    • Added a new C++ module for a CUDA-accelerated ungapped seed-extension algorithm, which uses seed positions in encoded input strands to extend and compute the alignment between the strands. Adapted from SegAlign's Ungapped Extender: S. Goenka, Y. Turakhia, B. Paten and M. Horowitz, "SegAlign: A Scalable GPU-Based Whole Genome Aligner," in 2020 SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Atlanta, GA, US, 2020 pp. 540-552.
  • Pygenomeworks

    • CIGAR strings for pairwise alignments can now be visualized.
  • Other

    • System requirements are updated.
    • GenomeWorks versioning changed from semantic versioning to calendar versioning.
    • DevicePreallocatedAllocator now allocates exactly the amount of memory requested.
    • Fixed silent execution errors, which could occur at GPU kernel launch under certain conditions.
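
The extended CIGAR format listed under CUDA Aligner above can be illustrated with a small, self-contained Python sketch. It is not the GenomeWorks implementation: the function name `extended_cigar` and the gapped-row input format are assumptions made for illustration only. The sketch takes two aligned rows of equal length, with `-` marking gaps, and emits `=` for matches, `X` for mismatches, `I` for bases present only in the query, and `D` for bases present only in the target.

```python
from itertools import groupby


def extended_cigar(query_aln: str, target_aln: str, gap: str = "-") -> str:
    """Build an extended CIGAR string from two gapped alignment rows.

    Hypothetical helper for illustration; not part of the GenomeWorks API.
    """
    if len(query_aln) != len(target_aln):
        raise ValueError("aligned rows must have the same length")

    ops = []
    for q, t in zip(query_aln, target_aln):
        if q == gap and t == gap:
            raise ValueError("alignment column has a gap in both rows")
        if t == gap:
            ops.append("I")  # base only in the query: insertion
        elif q == gap:
            ops.append("D")  # base only in the target: deletion
        elif q == t:
            ops.append("=")  # sequence match
        else:
            ops.append("X")  # sequence mismatch

    # Run-length encode consecutive identical operations.
    return "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))


print(extended_cigar("ACGT-ACC", "ACCTTA-C"))  # prints 2=1X1=1D1=1I1=
```

In the basic CIGAR format, the `=` and `X` runs above would both be reported as `M`, so the mismatch at the third column would not be visible.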

v0.5.3

22 Sep 19:43

HOTFIX - Fixes to unblock GCC 9 and CUDA 11 beta support.

v0.5.2

04 Sep 17:07

HOTFIX - Fixed an integer overflow in cudaaligner.

v0.5.1

19 Aug 15:29

HOTFIX - Updated CMake structure to enable better integration with external tools.

GenomeWorks v0.5.0

05 Aug 17:28

Release v0.5.0 brings major performance and functionality updates to all modules with a focus on improved handling of long read sequences.

  1. Clara Genomics Analysis -> GenomeWorks
  • As of this release, the repository name and associated project and package names have been updated to GenomeWorks. The Python bindings package is now available on PyPI as genomeworks.
  2. CUDA Partial Order Aligner (CUDA POA)
  • Novel adaptive banding implementation for partial order alignment achieves measurably better accuracy than default static band parameters with a marginal drop in performance.
  • Support for consensus and MSA of long read sequences in all accuracy modes (full matrix, adaptive band and static band).
  • Several bug fixes and general stability improvements.
  • Backward-incompatible API changes.
  3. CUDA Aligner
  • New banded Myers algorithm with compact memory footprint and adjustable band size accelerates cudaaligner performance of long read global alignment (10-15kb sequences) by ~3x over previous implementations with comparable accuracy. Narrow bands may lead to non-optimal alignments.
  • Updated Alignment object provides edit distance for each alignment and a new flag to signal optimal vs non-optimal alignments.
  • Bug fixes and improved test coverage.
  • Backward-incompatible API changes.
  4. CUDA Mapper
  • New conditions for fusing overlaps and changes to default parameters improve accuracy on small genomes without genomic repeats (NG50, mismatches, and indel accuracy matching minimap2 for E. coli and S. aureus).
  • Saving copies of indices in host memory and transferring them to device memory on demand both avoids additional index generation and leaves sufficient device memory for the matcher and overlapper.
  • Using a CUB-based search algorithm yields better performance than Thrust because it leverages additional information about the data that is to be sorted.
  • New sample showcasing the use of cudamapper APIs to build a GPU-accelerated, minimizer-based mapper.
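
The note above that narrow bands may lead to non-optimal alignments can be illustrated with a minimal Python sketch. It is a plain dynamic-programming edit distance restricted to a diagonal band, not the bit-parallel Myers algorithm and not the cudaaligner implementation. The function name `banded_edit_distance` and its `band` parameter are hypothetical.

```python
def banded_edit_distance(query, target, band):
    """Edit distance computed only over cells with |i - j| <= band.

    Returns None when the end cell lies outside the band. Hypothetical
    helper for illustration; not part of the GenomeWorks API.
    """
    n, m = len(query), len(target)
    if abs(n - m) > band:
        return None

    inf = float("inf")
    # Row 0: reaching column j costs j insertions, if j is inside the band.
    prev = [j if j <= band else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        lo, hi = max(0, i - band), min(m, i + band)
        if lo == 0:
            curr[0] = i  # i deletions along the first column
        for j in range(max(1, lo), hi + 1):
            cost = 0 if query[i - 1] == target[j - 1] else 1
            curr[j] = min(prev[j - 1] + cost,  # match or mismatch
                          prev[j] + 1,         # deletion
                          curr[j - 1] + 1)     # insertion
        prev = curr
    return prev[m]


print(banded_edit_distance("TACG", "ACGT", band=0))  # prints 4
print(banded_edit_distance("TACG", "ACGT", band=1))  # prints 2
```

With `band=0` only the main diagonal is explored, so the result is the four mismatches. Widening the band to 1 admits the path that deletes the leading `T` and inserts the trailing `T`, giving the optimal distance of 2. A flag like the one on the updated Alignment object lets callers tell these two situations apart.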

ClaraGenomicsAnalysis Release 0.4.4

26 Mar 21:17
0cb8890
  • Bug fix: Updated the CMake default installation path mechanism.

ClaraGenomicsAnalysis 0.5.0 - Release Candidate 1

04 Mar 21:46
8adea70

Release candidate for v0.5.0 focuses primarily on improving the performance of the GPU-accelerated mapper, cudamapper. Highlights of this release so far:

  1. CUDA Mapper
  • Enabled filtering of high-frequency sketch elements in the index (-F option; see the help message for details).
  • Host and device caching for improved indexer performance (-C and -c).
  • Smart cached memory allocator for improved device memory initialization time.
  • Re-architected matcher/overlapper components to improve end-to-end performance.
  • Algorithmic improvements to the overlapper lead to higher accuracy of overlaps.

Application           | Dataset          | GPU           | v0.4.3 | v0.5.0-rc1 | Acceleration
cudamapper all-vs-all | ONT E. coli 150x | 1x GV100 32GB | 72.3s  | 23.2s      | 3.11x
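
The idea of filtering high-frequency sketch elements can be sketched in a few lines of Python. This is an illustration under stated assumptions, not cudamapper's implementation. It uses lexicographic ordering of forward-strand k-mers as the minimizer ordering, and it assumes the filter is a relative-frequency threshold; the actual semantics of -F are described in the cudamapper help message. The names `minimizers`, `filter_frequent` and `max_fraction` are hypothetical.

```python
from collections import Counter


def minimizers(seq, k, w):
    """Return sorted (position, kmer) pairs: the smallest k-mer in each
    window of w consecutive k-mers (ties go to the leftmost position).

    Hypothetical helper for illustration; not part of the GenomeWorks API.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if len(kmers) < w:
        return []  # sequence is too short to fit in at least one window

    picked = set()
    for start in range(len(kmers) - w + 1):
        best = min(range(start, start + w), key=lambda i: (kmers[i], i))
        picked.add((best, kmers[best]))
    return sorted(picked)


def filter_frequent(sketch, max_fraction):
    """Drop sketch elements whose k-mer makes up more than max_fraction
    of all sketch elements (assumed filtering rule)."""
    counts = Counter(kmer for _, kmer in sketch)
    limit = max_fraction * len(sketch)
    return [(pos, kmer) for pos, kmer in sketch if counts[kmer] <= limit]


sketch = minimizers("ACGTACGTAC", k=3, w=4)
print(sketch)                        # prints [(0, 'ACG'), (4, 'ACG')]
print(filter_frequent(sketch, 0.5))  # prints []
```

Removing very frequent sketch elements trades a small loss in sensitivity for fewer spurious candidate matches in repetitive regions, which is the usual motivation for this kind of filter.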

ClaraGenomicsAnalysis Release 0.4.3

02 Dec 17:12
97b0b70

Release 0.4.3 is a hotfix for wheel package generation in pyclaragenomics.

  • Added Python 3.5 to the PyPI classifiers.
  • Fixed a liblogging.so linkage error that occurred in Python 3.5.