Add AVX/AVX2 support #43

melsman · 2020-04-27T11:45:30Z

Add support for packed vector instructions for floating point and integer operations.

Design and implement a generic signature that supports explicit operations (e.g., mul, add) on packed values, for instance 64-bit floating-point values held in 256-bit packed vector registers.
Design and implement structures that match the above signature (e.g., for packed 64-bit floats and for packed 64-bit integers). Make use of the MLKit prim feature for intrinsics.
Implement support for the intrinsics in the Compiler/Lambda/LambdaExp MLKit intermediate language, to be targeted by the operations in the structures. Implement support for the operations all the way down to the Compiler/Backend/X64/CodeGenX64 and Compiler/Backend/X64/CodeGenUtilX64 modules (e.g., extend the operations in Compiler/Backend/PrimName.sml).
Implement operations for loading from and storing to memory. We can use the BlockF64 values for representing and allocating memory.
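
As a rough illustration of the semantics the items above describe, the following C sketch models a 4-lane packed 64-bit float value with plain scalar code. All names (`f64x4`, `f64x4_add`, `f64x4_mul`, `f64x4_load`, `f64x4_store`, `mul_add_blocks`) are hypothetical and are not MLKit or intrinsic names, and the `double` array only stands in for a BlockF64-style block, whose actual layout is not assumed here. It is a reference model under those assumptions, not the proposed implementation.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* 256 bits / 64 bits per double */

/* Hypothetical scalar model of one 256-bit register holding four doubles. */
typedef struct { double lane[LANES]; } f64x4;

/* Lane-wise addition: the result a packed add would be expected to give. */
static f64x4 f64x4_add(f64x4 a, f64x4 b) {
    f64x4 r;
    for (size_t i = 0; i < LANES; i++) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Lane-wise multiplication. */
static f64x4 f64x4_mul(f64x4 a, f64x4 b) {
    f64x4 r;
    for (size_t i = 0; i < LANES; i++) r.lane[i] = a.lane[i] * b.lane[i];
    return r;
}

/* Load four consecutive doubles starting at block[off]. */
static f64x4 f64x4_load(const double *block, size_t len, size_t off) {
    f64x4 r;
    assert(off + LANES <= len);  /* stay inside the block */
    for (size_t i = 0; i < LANES; i++) r.lane[i] = block[off + i];
    return r;
}

/* Store four lanes to block[off .. off+3]. */
static void f64x4_store(double *block, size_t len, size_t off, f64x4 v) {
    assert(off + LANES <= len);
    for (size_t i = 0; i < LANES; i++) block[off + i] = v.lane[i];
}

/* Example use: dst[i] = x[i] * y[i] + dst[i], four elements per step.
   n is assumed to be a multiple of LANES. */
static void mul_add_blocks(double *dst, const double *x, const double *y, size_t n) {
    for (size_t off = 0; off < n; off += LANES) {
        f64x4 p = f64x4_mul(f64x4_load(x, n, off), f64x4_load(y, n, off));
        f64x4_store(dst, n, off, f64x4_add(p, f64x4_load(dst, n, off)));
    }
}
```

On AVX hardware one would expect each of the four primitive functions to map to a single packed instruction, which is what makes a scalar model like this usable as a test oracle for the code generator.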

Discussion.

An important aspect is that the implementation must include boxing operations that implicitly box vector values into memory. The optimiser can then eliminate box-unbox and unbox-box compositions. Boxing is needed because, in general, it is impossible to ensure that a value is not passed to a generic function, stored in a data structure, or captured in a closure, and the compiler assumes that every value can be represented in one 64-bit word (perhaps tagged with the LSB being 1, if the GC should not traverse the value).
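
The box/unbox pairing can be illustrated with a small C model. In this sketch `box` copies the four lanes to freshly allocated memory and returns a pointer, which fits in one 64-bit word on x64, and `unbox` loads the lanes back. All names are hypothetical, and `malloc` merely stands in for the MLKit allocator (regions, tagging and GC are not modelled).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scalar model of a 256-bit value: four 64-bit floats. */
typedef struct { double lane[4]; } f64x4;

/* Box: copy the vector into memory; the result is one pointer-sized word. */
static f64x4 *box(f64x4 v) {
    assert(sizeof(f64x4 *) <= 8);  /* boxed form fits in a 64-bit word */
    f64x4 *p = malloc(sizeof *p);
    if (p == NULL) abort();
    *p = v;
    return p;
}

/* Unbox: load the four lanes back from memory. */
static f64x4 unbox(const f64x4 *p) {
    return *p;
}

/* Unoptimised form: the value round-trips through memory. */
static f64x4 roundtrip(f64x4 v) {
    f64x4 *p = box(v);
    f64x4 r = unbox(p);
    free(p);
    return r;
}

/* The form an optimiser could produce after eliminating unbox(box(v)). */
static f64x4 roundtrip_optimised(f64x4 v) {
    return v;
}
```

Since `unbox(box(v))` always yields `v`, the two functions agree on every input, which is the property that lets the optimiser drop the allocation whenever the boxed value does not escape.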

I foresee some issues with implementing support for register allocation on the ymm registers. We must also make sure that the optimiser (i.e., module Compiler/Lambda/OptLambda) does not pass wide 256-bit values to generic functions. Such values cannot be passed as arguments to functions, nor can they be stored in closures; they are solely for operations within basic blocks. Ideally, these restrictions could be enforced in Compiler/Lambda/LambdaStatSem.

An interesting application would be to use these operations to implement some of the operations in the Real64Array / Real64Vector structures efficiently.

melsman added the enhancement label Apr 27, 2020

christiankjaer mentioned this issue Jan 29, 2021

[WIP] AVX2 support #56

Open
