Releases: cyclops-community/ctf

Cyclops v1.5.5

10 Aug 20:05

Release includes new functionality:

  • search for best tree contraction ordering
  • symmetric eigensolve interface to ScaLAPACK
  • new python-level benchmarks for TTM/TTTP/MTTKRP

but mostly numerous bug fixes:

  • extensive corrections to CCSR (hypersparse formats)
  • important corrections to mapping logic for expensive large-scale tensor contractions
  • adjustments to performance models, fix to read in CTF_MODEL_FILE
  • use of mallinfo to determine the actual total amount of memory in use (turned off with -DNOMALLINFO)
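
The idea of searching for a contraction ordering can be illustrated with NumPy as a stand-in. This is only a sketch of the concept: it uses `numpy.einsum_path`, not the Cyclops search, and the matrix shapes are made up for illustration.

```python
import numpy as np

# Chain of three matrices with very different sizes, so the order of
# pairwise contractions changes the amount of work (shapes are illustrative).
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 64))
B = rng.standard_normal((64, 4))
C = rng.standard_normal((4, 32))

# Ask NumPy to search exhaustively for a pairwise contraction order.
path, report = np.einsum_path("ij,jk,kl->il", A, B, C, optimize="optimal")

# Reuse the precomputed order when evaluating the contraction.
ordered = np.einsum("ij,jk,kl->il", A, B, C, optimize=path)

# Plain left-to-right evaluation, for comparison of the numerical result.
reference = A @ B @ C
```

Printing `report` shows the chosen order and NumPy's estimated operation counts; the numerical result does not depend on the order, only the cost does.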

Cyclops v1.5.4

24 Apr 23:55

Release includes bug fixes and a number of major functionality extensions:

  • CCSR layout (CSR with compression of row counts to nonzero rows) is now supported for sparse=sparse*dense contractions
  • TTTP (tensor times tensor product) is now supported, which enables efficient handling of contractions like V["ijk"] = U["ijk"]*A["jr"]*B["kr"], to arbitrary order with operands like A and B for any selection of modes in U
  • Cholesky and triangular solve are now directly interfaced to ScaLAPACK, also available on python layer
  • reshape functionality is now available at C++ level
  • sparse tensor I/O is now available
  • a randomized SVD routine has been added to C++ and Python level
  • option to threshold singular values has been added to regular SVD
  • SVD can now be called explicitly on tensors, with modes selected for the left and right singular vectors via strings
  • a light-weight contraction execution time model has been added
  • contraction of a sequence of tensors is now done in the best linear ordering for up to 8 tensors (this will be improved in the near future to the best tree ordering)
  • depending on the predicted sparsity of the output, sparse intermediates are now automatically defined and used within a sequence of contractions, including for sparse*dense contractions where the sparse tensor has fewer nonzeros than rows in the corresponding matrix unfolding
  • various bug fixes have been made to the python interface; in particular, sparsity is now maintained and scaling is handled more properly
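
The TTTP expression quoted above, V["ijk"] = U["ijk"]*A["jr"]*B["kr"], can be written out in dense NumPy to make its meaning concrete. This is a minimal sketch of the arithmetic only, assuming small dense arrays with made-up sizes; it does not use the Cyclops TTTP routine or its sparse handling.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 3, 4, 5, 2          # illustrative sizes
U = rng.standard_normal((I, J, K))
A = rng.standard_normal((J, R))
B = rng.standard_normal((K, R))
U[0, 0, 0] = 0.0                 # one zero entry, to show it stays zero

# V[i,j,k] = U[i,j,k] * sum_r A[j,r] * B[k,r]
V = np.einsum("ijk,jr,kr->ijk", U, A, B)

# The same result in two explicit steps: form sum_r A[j,r]*B[k,r],
# then scale U elementwise, broadcasting over the i mode.
V_check = U * (A @ B.T)[np.newaxis, :, :]
```

Because every output entry is a multiple of the matching entry of U, V is zero wherever U is zero, so a sparse U gives an output with the same sparsity pattern.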

Cyclops v1.5.3

25 Nov 23:18

Release focuses on bug fixes and minor functionality extensions:

  • better integration and support of ScaLAPACK routines
  • bug fixes to build systems
  • bug fixes to int64_t overflows for very large processor counts
  • bug fixes to Hadamard products for sparse tensors

Cyclops v1.5.2

04 Jun 21:29

This release introduces new functionality and interface changes. The changes are consequential to performance, especially for sparse tensors and non-standard element types.

  1. data is now stored, accepted, and returned as an array of objects (i.e. created by new rather than malloc); read_local remains as before (memory should be released by free), but two new routines have been introduced, get_local_data and get_local_pairs, which return data that should be released with delete
  2. as a consequence of (1), more general object types are now supported for sequential execution
  3. via (2) it is now possible to leverage block sparsity, by defining a sparse tensor on MPI_COMM_SELF whose elements are dense tensors on MPI_COMM_WORLD (or any desired communicator), which yields bulk-synchronous execution of each block contraction and a memory-efficient block-sparse layout (see examples/block_sparse.cxx)
  4. QR and SVD are now available in C++ and Python, supporting real/complex in single/double precision
  5. the build system has been improved to fix bugs and improve robustness

Cyclops v1.5.1

08 Jan 22:11

Bug-fixing and functionality expansion release for Cyclops v1.5. Major fixes include the following:

  1. slice and permute corrected at C++ level for overwriting existing data with zeros as appropriate.
  2. __setitem__ and __getitem__ now behave consistently with numpy in a number of cases and permit strided slices.
  3. type conversions automatically handled in python
  4. test suite for python has been significantly extended
  5. print function corrected for writing data to file as opposed to stdout
  6. symmetry handled correctly at C++ and python level for slicing, although resulting slice is always declared nonsymmetric
  7. dot function for python corrected
  8. universal functions in python implemented and working

Cyclops v1.5.0

29 Dec 05:27

The v1.5.0 release includes the following developments:

  • Python support
    • via Cython, also requires numpy
    • ctf.tensor interface follows numpy.ndarray
    • see updated github page and docs for functionality supported
  • ScaLAPACK interoperability
    • extension and fixes for conversion functions between ScaLAPACK matrices and cyclops matrices
    • simple interface to ScaLAPACK SVD for cyclops matrices
    • automatic ScaLAPACK build via --build-scalapack flag
  • HPTT support
    • tensor transposition performance improved drastically when building with the high performance tensor transpose (HPTT) library developed by Paul Springer
    • tensor transposition semantics changed a bit for contractions
    • HPTT can be automatically built with the configure flag --build-hptt
  • Hadamard products and batched BLAS
    • pure Hadamard products like c["ij"]+=a["ij"]*b["ij"]; recognized and done by simple loops
    • batched BLAS used when possible, including transposition to appropriate ordering of tensor modes
    • fast batched BLAS routines supported via MKL library
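
A pure Hadamard product such as c["ij"]+=a["ij"]*b["ij"] involves no summed index, which is why simple loops suffice. The sketch below spells that out with NumPy arrays and made-up values; it illustrates the arithmetic, not the Cyclops implementation.

```python
import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)
b = np.full((2, 3), 2.0)
c = np.ones((2, 3))

# Explicit loops: each output entry depends only on the matching inputs.
c_loop = c.copy()
for i in range(2):
    for j in range(3):
        c_loop[i, j] += a[i, j] * b[i, j]

# Equivalent elementwise form.
c_vec = c + a * b
```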

Version 1.4.2

14 Oct 14:22

various bug fixes, support for input/output from block-cyclic (ScaLAPACK descriptor) layouts, reorganization of examples and addition of new ones (spectral element, algebraic multigrid, correct FFT)

CTF v1.4.1: major bug fixes to memory accounting in sparse contractions and construction of tensors with predefined processor grid mapping

02 Aug 08:43

Substantially extended sparse contraction capabilities

02 Aug 08:39

It is now possible to contract two sparse tensors into a potentially sparse output. The release also includes bug fixes, improvements to start-up time at larger parallel scale, capability for writing tensors to disk via MPI_I/O, and improvements for multitype contraction/summation support. The semantics of Transform with sparse output have changed for summations: they now only modify existing sparse elements rather than create new nonzeros.

Strong and weak scalability plots of this version for the sparse MP3 code in examples/sparse_mp3.cxx are below
spmp3_ss_edison_jul2016.pdf
spmp3_ws_edison_jul2016.pdf

v1.3.5

06 Jul 09:34

Eliminated a bug that caused CTF to incorrectly infer it was leaking memory when it was not, and to crash in extended runs because it believed it was out of memory.

(previously v1.35)