add complex.hlsl #592

nahiim · 2023-11-03T13:44:54Z

Description

Start drafting complex.hlsl

devshgraphicsprogrammingjenkins · 2023-11-03T13:46:03Z

[CI]: Can one of the admins verify this patch?

include/nbl/builtin/hlsl/complex.hlsl

devshgraphicsprogramming · 2023-12-16T10:29:30Z

include/nbl/builtin/hlsl/radix_fft.hlsl

+// Fast Fourier Transform
+template<typename T, uint32_t N>
+void fft_radix2(complex::complex_t<T> data[N], bool is_inverse)


make the is_inverse a template parameter

assert(N<gl_SubgroupSize)

btw the data need not be data[N], each invocation comes with its own register value and you can cooperatively exchange via the use of subgroupShuffleXor (add it to https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/include/nbl/builtin/hlsl/glsl_compat/subgroup_shuffle.hlsl please).

I mean the original GLSL code is here
https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/include/nbl/builtin/glsl/subgroup/fft.glsl

devshgraphicsprogramming · 2023-12-16T10:30:22Z

include/nbl/builtin/hlsl/radix_fft.hlsl

+    // Cooley-Tukey
+    for (uint32_t stride = 2; stride <= N; stride *= 2)
+    {
+        complex::complex_t<T> twiddle_factor = { cos((is_inverse ? 2.0 : -2.0) * 3.14 / stride), sin((is_inverse ? 2.0 : -2.0) * 3.14 / stride) };


use the proper full FP64 value of PI

btwa nice piece of trivia @Fletterio , only GLSL defines fp64 transcendentals, HLSL doesn't (the sin,cos etc. functions always return float32, and if you try to use the GL.450.std extended instruction set to compute cos(double) this will end up emitting Float64 capability which is a definite DO NOT WANT)

So these definitely need to be tabulated as very precise constants and C-casted into the correct nbl::hlsl::scalar_type_t<T> so we don't accidentally emit Float64 capability

I'm not really following - we want to use the full PI value in f64, but not emit f64 capability?

yes you want to cast the constant into a type you'd be using, see the Gauss Legendre / Quadrature header stuff that Erfan is using for example 62 CAD

nvm about the tabulation... its not a perf win compared to just calling sin and cos

for now just assert that T for the FFT complex numbers is NOT float64_t because that opens a whole can of worms of std.glsl spir-v intrinsic header

or if you feel like adding sin and cos in doubles, do it here:

Nabla/include/nbl/builtin/hlsl/spirv_intrinsics/core.hlsl

Line 94 in 6b5f3d8

[[vk::ext_instruction(GLSLstd450MatrixInverse)]]

include/nbl/builtin/hlsl/glsl_compat/subgroup_shuffle.hlsl

include/nbl/builtin/hlsl/complex.hlsl

devshgraphicsprogramming · 2024-05-14T18:29:52Z

I see a certain ladder of features, and each should have their own branch (so I can merge in order without having all comments addressed):

having a complex_t struct to use
subgroup::fft utilities for DIT and DIF
workgroup::fft utilities for DIT and DIF
a way to do multiple input items per-workgroup, i.e. do a 2048 sized FFT with a 512 workgroup while having the option of only using enough shared memory for 512 FFT + auxilary accessor
optimizations for FFT real signals
resurrect the nbl::ext and 1 or both of the old FFT examples to test with

They can branch off each other.

P.S. Let me know if you need "runtime" transcendental/special functions for complex_t then we can look into doing up our tgmath.hlsl

Missing compound operators and template specializations for integer types. Arithmetic logic is off, different from the STL

include/nbl/builtin/hlsl/complex.hlsl

devshgraphicsprogramming · 2024-05-30T21:53:43Z

include/nbl/builtin/hlsl/complex.hlsl

+    template<typename Scalar>
+    struct complex_t
+    {


dont indent extra because of namespace

devshgraphicsprogramming · 2024-05-31T05:55:33Z

include/nbl/builtin/hlsl/binops.hlsl

+#define COMPOUND_ASSIGN(NAME,OP) template<typename T> struct assign_ ## NAME ## _t { \
+    void operator()(NBL_REF_ARG(T) lhs, NBL_CONST_REF_ARG(T) rhs) \
+    { \
+        lhs = lhs OP rhs; \
+    } \
+}
+COMPOUND_ASSIGN(add,+);
+COMPOUND_ASSIGN(subtract,-);
+COMPOUND_ASSIGN(mul,*);
+COMPOUND_ASSIGN(div,/);
+
+#undef COMPOUND_ASSIGN


add them here

Nabla/include/nbl/builtin/hlsl/functional.hlsl

Line 67 in 6633295

and make sure they have names plus_assign, multiplies_assign to be inline with the rest.

Also because its just 4 struct defs and not specialization, IMHO you shouldn't use macros because that just makes the code iffy to read. Or if you are to use a macro make sure it has the same form as the ALIAS_STD so you have to put th e }; afterwards to be able to add extra fields (i.e. identity)

devshgraphicsprogramming · 2024-05-31T06:22:02Z

include/nbl/builtin/hlsl/complex.hlsl

+        static complex_t create(const complex_t other)
+        {
+                return other;
+        }


no needed, regular assignment will do

devshgraphicsprogramming · 2024-05-31T06:25:06Z

include/nbl/builtin/hlsl/complex.hlsl

+                return !(uint64_t(m_real) ^ uint64_t(rhs.m_real)) && !(uint64_t(m_imag) ^ uint64_t(rhs.m_imag));
+        }
+        bool operator!=(const complex_t rhs)
+        {
+                return uint64_t(m_real) ^ uint64_t(rhs.m_real) || uint64_t(m_imag) ^ uint64_t(rhs.m_imag);


don't cast to uint64_t :

you're not comparing bitpatterns, you're casting to uint64_T which truncatest the floats, etc.

Scalar=float32_t would have been 32bit

!= and == work on floats, so use them

devshgraphicsprogramming · 2024-05-31T06:25:47Z

include/nbl/builtin/hlsl/complex.hlsl

+    // ---------------------- Compound assignment functors ----------------------
+    template <typename Scalar>
+    struct ComplexCompoundHelper{              
+        using add = assign_add_t<complex_t<Scalar> >;
+        using subtract = assign_subtract_t<complex_t<Scalar> >;
+        using mul = assign_mul_t<complex_t<Scalar> >;
+        using div = assign_div_t<complex_t<Scalar> >;
+    };
+
+    template <typename Scalar>
+    using assign_add_complex = typename ComplexCompoundHelper<Scalar>::add;
+    template <typename Scalar>
+    using assign_subtract_complex = typename ComplexCompoundHelper<Scalar>::subtract;
+    template <typename Scalar>
+    using assign_mul_complex = typename ComplexCompoundHelper<Scalar>::mul;
+    template <typename Scalar>
+    using assign_div_complex = typename ComplexCompoundHelper<Scalar>::div;


I don't get this, we don't have them for regular plus<T>

devshgraphicsprogramming · 2024-05-31T06:33:04Z

include/nbl/builtin/hlsl/fft/common.hlsl

+{
+namespace hlsl
+{
+namespace fft::common {


HLSL doesn't support nested namespace syntax

use detail or impl for things you don't want to make public to others

devshgraphicsprogramming · 2024-05-31T06:33:51Z

include/nbl/builtin/hlsl/fft/common.hlsl

+    template<typename Scalar, bool inverse> 
+    struct DIX {
+        //Assumes I have precomputed twiddles, so just pass the index. 
+        static void radix2Butterfly(NBL_CONST_REF_ARG(complex_t<Scalar>) twiddle, NBL_REF_ARG(complex_t<Scalar>) lo, NBL_REF_ARG(complex_t<Scalar>) hi){


drop Butterfly from the name, just radix2 is fine

include/nbl/builtin/hlsl/fft/common/twiddles_impl.hlsl

devshgraphicsprogramming · 2024-05-31T11:21:10Z

include/nbl/builtin/hlsl/fft/common/twiddles.hlsl

+    template<typename Scalar>
+    complex_t<Scalar> getHigh(uint32_t highIdx); 
+
+    template<typename Scalar>
+    complex_t<Scalar> getMid(uint32_t midIdx); 
+
+    template<typename Scalar>
+    complex_t<Scalar> getLow(uint32_t lowIdx);
+
+    // Define twiddles and getters for each Scalar template specialization
+    #define float_t float32_t
+    #define TYPED_NUMBER(N) NBL_CONCATENATE(N, f) // to add f after floating point numbers and avoid casting warnings and emitting ShaderFloat64 Caps
+    #include <nbl/builtin/hlsl/fft/common/twiddles_impl.hlsl>
+    #undef TYPED_NUMBER
+    #undef float_t 
+
+    #define float_t float64_t
+    #define TYPED_NUMBER(N) N
+    #include <nbl/builtin/hlsl/fft/common/twiddles_impl.hlsl>
+    #undef TYPED_NUMBER
+    #undef float_t


don't bother precomputing twiddles, did this benchmark in shadertoy

const float _cos[32] = float[32]( 1.0,0.980785,0.92388,0.83147,0.707107,0.55557,0.382683,0.19509,-4.37114e-08,-0.19509,-0.382684,-0.55557,-0.707107,-0.83147,-0.92388,-0.980785,-1.0,-0.980785,-0.92388,-0.83147,-0.707107,-0.55557,-0.382683,-0.19509,1.19249e-08,0.19509,0.382684,0.55557,0.707107,0.83147,0.92388,0.980785 ); void mainImage( out vec4 fragColor, in vec2 fragCoord ) { // Normalized pixel coordinates (from 0 to 1) vec2 uv = fragCoord/iResolution.xy; // Time varying pixel color vec3 col = vec3(0.0); const int spins = 2000; for (int i=0; i<spins; i++) { int iarg = i;//+int(fragCoord.x)+int(fragCoord.y); float arg = float(iarg)/float(spins); float cs = cos(arg*6.2831853); col.rg += vec2(_cos[iarg]); } // Output to screen fragColor = vec4(col,1.0); }

perf dropped 50%, it gets worse the bigger the array gets

devshgraphicsprogramming · 2024-05-31T11:23:02Z

include/nbl/builtin/hlsl/radix_fft.hlsl

+namespace hlsl
+{
+
+const float64_t PI = 3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679;


please make a numbers.hlsl a counterpart to https://en.cppreference.com/w/cpp/numeric/constants

devshgraphicsprogramming · 2024-05-31T11:25:22Z

include/nbl/builtin/hlsl/radix_fft.hlsl

+// Fast Fourier Transform
+template<typename T, uint32_t N, bool is_inverse>
+void fft_radix2(complex_t<T> data[N])
+{
+    const uint32_t log2N = uint32_t(log2(N));


why did you make this function?

I explicitly left behind a comment #592 (comment)

is this supposed to be a workgroup FFT? a subgroup FFT or what?

devshgraphicsprogramming · 2024-05-31T11:25:54Z

include/nbl/builtin/hlsl/radix_fft.hlsl

+    // Bit reversal permutation
+    for (uint32_t i = 0; i < N; ++i)
+    {
+        uint32_t j = 0;
+        for (uint32_t bit = 0; bit < log2N; ++bit)
+            j |= ((i >> bit) & 1) << (log2N - 1 - bit);
+
+        if (j > i)
+        {
+            complex_t<T> temp = data[i];
+            data[i] = data[j];
+            data[j] = temp;
+        }
+    }


don't do the bit reversal for me, its often not needed

devshgraphicsprogramming · 2024-05-31T11:30:39Z

include/nbl/builtin/hlsl/radix_fft.hlsl

+        if(is_inverse)
+        {
+            twiddle_factor = { cos(2.0 * PI / stride), sin(2.0 * PI / stride) };
+        }
+        else
+        {
+            twiddle_factor = { cos(-2.0 * PI / stride), sin(-2.0 * PI / stride) };
+        }


the cosine is the same, because its an even function

sine is an odd function

TL;DR its enough for you to do

const float arg = pi_v<T>*2.f; twiddle_factor = {cos(arg),sin(arg)}; if (is_inverse) twiddle_factor.imag = -twiddle_factor.imag; // compiler should optimize to a bitflip

devshgraphicsprogramming · 2024-05-31T11:31:52Z

include/nbl/builtin/hlsl/radix_fft.hlsl

+    // Cooley-Tukey
+    assign_mul_complex<T> mul_assign;
+    assign_div_complex<T> div_assign;
+    for (uint32_t stride = 2; stride <= N; stride *= 2)


I'd do a loop from stride=1 to stride=N/2 instead of 2->N because that's the actual stride between your butterfly elements, what you have right now is a width

devshgraphicsprogramming · 2024-05-31T11:32:57Z

include/nbl/builtin/hlsl/glsl_compat/subgroup_shuffle.hlsl

@@ -30,6 +30,14 @@ T subgroupShuffleDown(T value, uint32_t delta)
    return spirv::groupShuffleDown<T>(spv::ScopeSubgroup, value, delta);
 }

+
+template<typename T>
+T subgroupShuffleXor(T value, uint32_t delta)


does the spec call it a delta? I'd call it a mask

add complex.hlsl

448003b

nahiim added 3 commits November 3, 2023 16:33

rewrite

5b575e6

rewrite complex

c2c5ddb

const

8045a3b

devshgraphicsprogramming reviewed Nov 4, 2023

View reviewed changes

include/nbl/builtin/hlsl/complex.hlsl Outdated Show resolved Hide resolved

nahiim added 2 commits December 15, 2023 12:35

radix fft

6d775ff

if_inverse

8a8938a

devshgraphicsprogramming reviewed Dec 16, 2023

View reviewed changes

include/nbl/builtin/hlsl/complex.hlsl Outdated Show resolved Hide resolved

devshgraphicsprogramming reviewed Dec 16, 2023

View reviewed changes

include/nbl/builtin/hlsl/complex.hlsl Outdated Show resolved Hide resolved

devshgraphicsprogramming reviewed Dec 16, 2023

View reviewed changes

wrap it up

b3f917e

devshgraphicsprogramming reviewed Dec 19, 2023

View reviewed changes

include/nbl/builtin/hlsl/glsl_compat/subgroup_shuffle.hlsl Outdated Show resolved Hide resolved

AnastaZIuk force-pushed the master branch 3 times, most recently from a07e79f to ffbd843 Compare January 18, 2024 21:24

devshgraphicsprogramming reviewed May 14, 2024

View reviewed changes

include/nbl/builtin/hlsl/complex.hlsl Outdated Show resolved Hide resolved

Merge branch 'master' into nahim_complex

d259f0f

Adds all functions up to before exponentials in the cppreference list.

189246b

Missing compound operators and template specializations for integer types. Arithmetic logic is off, different from the STL

devshgraphicsprogramming reviewed May 20, 2024

View reviewed changes

include/nbl/builtin/hlsl/complex.hlsl Outdated Show resolved Hide resolved

Fletterio added 5 commits May 20, 2024 15:49

Adds binary ops, withc complex specialization

5fb62bc

Adds binops (forgot to add in last commit) + subgroupSuffleXor

af38a18

Minor update to radix_fft to use new operators

35412f7

Precomputed twiddles done, with specialization for f32 and f64

c4cdece

Subgroup FFT done!

53e4adf

devshgraphicsprogramming reviewed May 30, 2024

View reviewed changes

devshgraphicsprogramming reviewed May 31, 2024

View reviewed changes

include/nbl/builtin/hlsl/fft/common/twiddles_impl.hlsl Show resolved Hide resolved

devshgraphicsprogramming reviewed May 31, 2024

View reviewed changes

include/nbl/builtin/hlsl/fft/common/twiddles_impl.hlsl Show resolved Hide resolved

devshgraphicsprogramming reviewed May 31, 2024

View reviewed changes

include/nbl/builtin/hlsl/fft/common/twiddles_impl.hlsl Show resolved Hide resolved

devshgraphicsprogramming reviewed May 31, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add complex.hlsl #592

add complex.hlsl #592

nahiim commented Nov 3, 2023

devshgraphicsprogrammingjenkins commented Nov 3, 2023

devshgraphicsprogramming Dec 16, 2023

devshgraphicsprogramming Dec 16, 2023

devshgraphicsprogramming May 14, 2024

Fletterio May 20, 2024

devshgraphicsprogramming May 23, 2024

devshgraphicsprogramming May 31, 2024 •

edited

devshgraphicsprogramming commented May 14, 2024 •

edited

devshgraphicsprogramming May 30, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

devshgraphicsprogramming May 31, 2024

add complex.hlsl #592

Are you sure you want to change the base?

add complex.hlsl #592

Conversation

nahiim commented Nov 3, 2023

Description

devshgraphicsprogrammingjenkins commented Nov 3, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

devshgraphicsprogramming May 31, 2024 • edited

Choose a reason for hiding this comment

devshgraphicsprogramming commented May 14, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

devshgraphicsprogramming May 31, 2024 •

edited

devshgraphicsprogramming commented May 14, 2024 •

edited