Co-dfns v5.0.0

@arcfide arcfide released this 08 Nov 17:03
· 189 commits to master since this release

After much work and labor pains, the newest version of Co-dfns is here! This is a huge release with a massive number of changes, so it's worth laying them out.

What's New

New runtime architecture

The biggest thing to come to Co-dfns in this release is that the entire runtime has been rewritten to follow a new architecture and to allow almost all of it to be written in APL. This means that now the runtime is a separately compiled and distributed object that you'll need to build and ship with your product. See the installation guide for more details on how to build the runtime.

Almost complete language support

With the exception of Key (⌸), Format (⍕), and Inverse (⍣¯1), we expect that all normal primitives are now fully supported. This includes fixing and removing a number of the limitations in the previous implementations of some primitives.

Removal of most system limitations

We have removed nearly all known limitations on things like depth, rank, and data type. The only known remaining limitation of this kind is that indexing into device arrays of rank greater than 4 is not yet supported. Complex and nested arrays are supported for all primitives for which they make sense. We have also added support for character arrays (8-, 16-, and 32-bit) and for mixed arrays.

The removed limitations also include those on user-defined operators within the source code, which can now take any type of operand.

See the Limitations section of the manual for more information on the limitations that are still present in the system.

Improved error handling and stack tracing

When an error occurs in a compiled module, the system will generate a complete stack trace, with line information for both the original source input and the corresponding compiled target-language source line. This tracking occurs at the "token by token" level, meaning that you get a trace of the execution of an expression down to the individual application level, and not just the line level.

Furthermore, the parser now reports a much richer set of errors when problems are discovered at parse time.

Data type squeezing and handling of overflow

The system will now squeeze and promote arrays as appropriate. To maintain performance, arrays are automatically promoted to the largest, safest type for a given computation, and are squeezed down only when it seems convenient to do so. This is meant to improve performance on GPU kernels while still giving the advantages of squeezing and promotion.
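As a rough illustration of what squeezing and promotion mean in general, here is a minimal Python sketch. It is not the Co-dfns runtime: the type list, the function names `squeeze` and `promote_for_add`, and the one-step widening rule are all assumptions made for the example, and the real policy may differ (for instance, by promoting straight to a larger type).

```python
# Hypothetical illustration only; not the Co-dfns implementation.
# Integer types, narrowest first, with their inclusive value ranges.
INT_TYPES = [
    ("int8", -2**7, 2**7 - 1),
    ("int16", -2**15, 2**15 - 1),
    ("int32", -2**31, 2**31 - 1),
]
TYPE_ORDER = [name for name, _, _ in INT_TYPES] + ["float64"]


def squeeze(values):
    """Return the name of the narrowest type that can hold every value."""
    for name, lo, hi in INT_TYPES:
        if all(lo <= v <= hi for v in values):
            return name
    # No integer type in the list fits, so fall back to floating point.
    return "float64"


def promote_for_add(left_type, right_type):
    """Pick a result type wide enough that an addition cannot overflow."""
    widest = max(TYPE_ORDER.index(left_type), TYPE_ORDER.index(right_type))
    # Adding two n-bit integers can need n+1 bits, so step up one type
    # (assumed rule), capping at the widest type in the list.
    return TYPE_ORDER[min(widest + 1, len(TYPE_ORDER) - 1)]
```

In this sketch, promoting before a computation avoids checking for overflow element by element, and squeezing afterwards recovers the smaller storage type when the results allow it.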

CPU and GPU execution

The C runtime will now execute small array computations on the CPU, and only migrate computations to the GPU when a given threshold is crossed. GPU migration is meant to be mostly "sticky," so computations should stay on the GPU after they have migrated unless they collapse back down into very small arrays (scalars).
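The "sticky" behavior described above can be pictured as a small decision rule. The following Python sketch is only an assumption about the shape of such a policy: the threshold value, the element-count criterion, and the name `choose_device` are hypothetical and are not taken from the Co-dfns runtime.

```python
# Hypothetical policy sketch; the threshold and all names are assumptions.
GPU_THRESHOLD = 1024  # made-up element count at which work moves to the GPU


def choose_device(current_device, element_count, threshold=GPU_THRESHOLD):
    """Return "cpu" or "gpu" for the next computation."""
    if current_device == "cpu":
        # Small computations stay on the CPU until the threshold is crossed.
        return "gpu" if element_count >= threshold else "cpu"
    # Already on the GPU: stay there unless the data collapses to a scalar.
    return "cpu" if element_count <= 1 else "gpu"
```

The asymmetry is the point of the sketch: moving to the GPU requires a large array, but moving back requires the data to shrink to a scalar, which avoids bouncing data between devices when sizes hover near the threshold.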

Some added features

There are some new features which even Dyalog APL does not have:

  • We support any arbitrary rank, even greater than 15
  • Any arbitrary function can be modified with the axis operator. See rtm/prim.apln for some examples of how this is used and accessed.
  • There is a platform-agnostic foreign function interface using the primitive, also demonstrated in the rtm/prim.apln code.
  • We have full support for real closures, which enables future dfns behaviors to be richer than those presently possible in the interpreter
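For readers unfamiliar with the term, a "real closure" is a function that keeps access to the variables of the scope in which it was created, even after that scope has returned. The Python sketch below shows the general concept only; it says nothing about Co-dfns syntax or internals, and `make_counter` is a hypothetical name.

```python
def make_counter(start):
    """Return a function that remembers and updates `count` between calls."""
    count = start

    def step(increment=1):
        # `count` lives in the enclosing scope and survives after
        # make_counter has returned.
        nonlocal count
        count += increment
        return count

    return step
```

Each call to `make_counter` produces an independent function with its own captured state.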
     

What's changed

We have removed the old runtime API, since it is being redesigned and currently doesn't operate under the new runtime model. This includes the caching API.

The C calling conventions and behaviors are now different.

Unfortunately, the new runtime is not yet optimized for all workloads, and it is likely that you will see a slowdown in some of your code in this version as we work to improve the performance of the system over subsequent releases.

Our implementation of stencil follows the "stencil function" format instead of the stencil operator that is included in Dyalog APL. Please take note of this difference.

We currently generate excessive amounts of code, which can make large namespaces somewhat heavy and slow to compile. We are working to improve this.