
Releases: cyclops-community/ctf

Improved intermediates for expression evaluation

11 May 11:45

Expression evaluation now uses lower-order intermediates for tensor contraction chains.
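To illustrate why the order of intermediates matters, here is a minimal sketch that is not taken from CTF. It assumes "order" means the number of tensor indices and compares two ways of evaluating the chain A*B*v for n-by-n matrices A, B and a length-n vector v. The function names are hypothetical, and the cost model simply counts multiply-adds.

```cpp
#include <cassert>
#include <cstdint>

// Multiply-add count for an (m x k) by (k x n) product.
std::int64_t matmul_cost(std::int64_t m, std::int64_t k, std::int64_t n) {
    assert(m > 0 && k > 0 && n > 0);
    return m * k * n;
}

// (A*B)*v: the intermediate A*B is an n x n matrix (order 2).
std::int64_t cost_left_to_right(std::int64_t n) {
    return matmul_cost(n, n, n) + matmul_cost(n, n, 1);
}

// A*(B*v): the intermediate B*v is a length-n vector (order 1).
std::int64_t cost_right_to_left(std::int64_t n) {
    return matmul_cost(n, n, 1) + matmul_cost(n, n, 1);
}
```

In this toy model the order-1 intermediate reduces the work from roughly n^3 to 2n^2 multiply-adds and also needs less memory for the intermediate.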

Version 1.33: minor fixes, some improving sequential performance

26 Apr 16:02

Minor changes relative to the previous version: improved sequential performance by avoiding an extra copy, and corrections to the GPU support code.

Version 1.32: fixes and improvements to user-specified functions/transforms

16 Dec 22:23

Adjusted functionality and correctness of behavior for Functions and Transforms. Arbitrary-type support and multi-type functions are now more mature. For an explanation of usage, see the paper

http://arxiv.org/abs/1512.00066

as well as the shortest-path codes in examples/sssp.cxx and examples/apsp.cxx, and a more complicated betweenness centrality code in examples/btwn_central.cxx.

Sparse CTF

06 Jul 09:32

Added support for sparse tensors and extended the elementwise function interface to support C++11 lambdas.

Also made some incremental changes to performance models.

It is now possible to sum sparse tensors and contract a sparse tensor with a dense tensor.

Contractions between two sparse tensors are not yet supported.
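As a rough picture of what a sparse-dense contraction and a lambda-based elementwise function involve, the following sketch uses plain C++11 and does not use the CTF API. The `Entry` struct and both function names are hypothetical. The sparse matrix is stored as a list of coordinate entries, so only stored nonzeros are visited.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical coordinate-format nonzero; not a CTF type.
struct Entry { std::size_t row; std::size_t col; double val; };

// y_i = sum_j A_ij * x_j, touching only the stored nonzeros of A.
std::vector<double> spmv(const std::vector<Entry>& A, std::size_t nrows,
                         const std::vector<double>& x) {
    std::vector<double> y(nrows, 0.0);
    for (const Entry& e : A) {
        assert(e.row < nrows && e.col < x.size());
        y[e.row] += e.val * x[e.col];
    }
    return y;
}

// Apply a user-supplied function (e.g. a lambda) to every stored value.
template <typename F>
void apply_elementwise(std::vector<Entry>& A, F f) {
    for (Entry& e : A) e.val = f(e.val);
}
```

A call such as `apply_elementwise(A, [](double v) { return v * v; });` squares each stored value in place before the contraction.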

Version 1.31: Sparse CTF + CUDA support + performance model improvements

09 Nov 13:23

New features/updates:

  1. Renewed support for CUDA-based offloading
  2. New macro -DTUNE activates model tuning, which adjusts performance model parameters. A benchmark designed to execute a characteristic training set is included in bench/model_trainer and can be used to tune models for any architecture. Output at the end of the benchmark should be pasted into src/shared/init_models.cxx. Running with -DTUNE always on is not advisable. Current parameters are based on 16-node Edison runs with 4 processes per node and should be reasonable in most settings. A cubic polynomial model is also available, but is not used because its observed modelling quality was inferior.
  3. Added a mapping search that exhaustively tries all possible mappings. As no effective benefit was observed from this expanded search space and the search itself slowed down low-cost contractions (e.g. in CCSDT), this mapping search is only done when the time estimated by the old scheme is longer than 1 second.
  4. Fixed up configure file and generated config.mk file to be more effective and better documented.
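
The thresholding rule in item 3 can be sketched as follows. This is an illustrative assumption about the control flow, not CTF's implementation. Each candidate mapping is reduced to its modelled time in seconds, and `choose_mapping`, `heuristic_pick`, and `threshold_sec` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// est[i] is the modelled time (seconds) of candidate mapping i.
// heuristic_pick is the index chosen by the cheaper, older scheme.
std::size_t choose_mapping(const std::vector<double>& est,
                           std::size_t heuristic_pick,
                           double threshold_sec = 1.0) {
    assert(heuristic_pick < est.size());
    // Low-cost contraction: keep the heuristic choice and skip the search.
    if (est[heuristic_pick] <= threshold_sec) return heuristic_pick;
    // Otherwise scan every candidate and keep the one with the lowest estimate.
    std::size_t best = heuristic_pick;
    for (std::size_t i = 0; i < est.size(); ++i) {
        if (est[i] < est[best]) best = i;
    }
    return best;
}
```

The point of the threshold is that the exhaustive scan has its own cost, so it is only worth paying when the contraction itself is expected to run long enough to amortize it.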