
Error message "Unsupported HVX type: float32x32" #8170

Open
jxl1080 opened this issue Apr 2, 2024 · 7 comments

Comments

jxl1080 commented Apr 2, 2024

Hi,

I get the error message below when I run my generator with the Adams2019 auto-scheduler; it doesn't happen when I run the generator without an auto-scheduler. I don't really understand what it is telling me or what I should do:

Unhandled exception: Internal Error at /glnxa64/Halide/src/HexagonOptimize.cpp:105 triggered by user code at : Unsupported HVX type: float32x32

Below is my Halide Generator Class:

```cpp
#include "Halide.h"
#include <stdio.h>

using namespace Halide;

class mMatmul_matmul_out1_fcn_halide_generator : public Halide::Generator<mMatmul_matmul_out1_fcn_halide_generator> {

public:
    Input<Buffer<float>> B1{"B1", 2};
    Input<Buffer<float>> A1{"A1", 2};
    Output<Buffer<float>> matmul_out1_fcn{"matmul_out1_fcn", 2};

    void generate() {
        RDom r(0, 100);
        matmul_out1(d1, d2) = sum(A1(d1, r) * B1(r, d2));
        matmul_out1_fcn(d1, d2) = matmul_out1(d1, d2);
    }

    void schedule() {
        // Schedule is determined by the autoscheduler. Need to set estimates on buffers.
        if (using_autoscheduler()) {
            B1.dim(1).set_estimate(0, 100);
            B1.dim(0).set_estimate(0, 100);
            A1.dim(1).set_estimate(0, 100);
            A1.dim(0).set_estimate(0, 100);
            matmul_out1_fcn.set_estimate(d1, 0, 100).set_estimate(d2, 0, 100);
        } else {
            // Default schedule
        }
    }

private:
    Var d1{"d1"};
    Var d2{"d2"};
    Func matmul_out1{"matmul_out1"};
};

HALIDE_REGISTER_GENERATOR(mMatmul_matmul_out1_fcn_halide_generator, mMatmul_matmul_out1_fcn_halide_gen)
```

Thank you!

abadams (Member) commented Apr 2, 2024

The error means you're trying to compile to hvx, but your pipeline uses vectorized floats. I think our hexagon backend doesn't support the newer versions of hvx that support float vectors.

I think it isn't triggering without the autoscheduler, because then the schedule uses scalar floats only, which is fine. The autoscheduler isn't aware of that restriction on hexagon so it's trying to just vectorize everything.

jxl1080 (Author) commented Apr 2, 2024

@abadams Thank you so much for your quick reply! Is there any suggestion on how to resolve this error message?

> The error means you're trying to compile to hvx, but your pipeline uses vectorized floats. I think our hexagon backend doesn't support the newer versions of hvx that support float vectors.
>
> I think it isn't triggering without the autoscheduler, because then the schedule uses scalar floats only, which is fine. The autoscheduler isn't aware of that restriction on hexagon so it's trying to just vectorize everything.

abadams (Member) commented Apr 2, 2024

Don't try to do a floating point matrix multiply on hexagon. (Or at least the versions of hvx that Halide supports). It's not a good processor for running that algorithm, because you can't vectorize it. Do a fixed-point matrix multiply instead.
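The arithmetic behind a fixed-point matrix multiply can be sketched in plain C++ (this is not Halide code; the function name `matmul_u8` and the shapes are illustrative). The key point is that the uint8 inputs are widened before multiplying and accumulated in a 32-bit integer, so a reduction of length 100 cannot overflow (100 × 255 × 255 < 2³²):

```cpp
#include <cstdint>
#include <vector>

// Fixed-point matmul sketch: uint8 inputs, widened products, 32-bit accumulator.
// A is M x K, B is K x N, both row-major; returns C = A * B as M x N.
std::vector<uint32_t> matmul_u8(const std::vector<uint8_t> &A,
                                const std::vector<uint8_t> &B,
                                int M, int K, int N) {
    std::vector<uint32_t> C(M * N, 0);
    for (int i = 0; i < M; i++) {
        for (int j = 0; j < N; j++) {
            uint32_t acc = 0;  // wide accumulator: no overflow over the reduction
            for (int k = 0; k < K; k++) {
                acc += uint32_t(A[i * K + k]) * uint32_t(B[k * N + j]);
            }
            C[i * N + j] = acc;
        }
    }
    return C;
}
```

In Halide terms, this corresponds to casting the uint8 inputs up before the `sum()` so the reduction happens at a width that both holds the result and maps onto HVX's integer multiply-accumulate lanes.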

jxl1080 (Author) commented Apr 2, 2024

@abadams I'm not sure whether I misunderstood your point about not doing a floating-point matrix multiply. I changed my data type to 'uint8_t', but now I'm in a worse situation when I run my generator with Adams2019: I get a segmentation fault with no error message at all.

abadams (Member) commented Apr 2, 2024

Can you share a repro that crashes (including the build commands you're using)?

jxl1080 (Author) commented Apr 3, 2024

> Can you share a repro that crashes (including the build commands you're using)?

@abadams Thank you for your help! Below is the code of my Halide Generator Class:

```cpp
#include "Halide.h"
#include <stdio.h>

using namespace Halide;

class mMatmul_matmul_out1_fcn_halide_generator : public Halide::Generator<mMatmul_matmul_out1_fcn_halide_generator> {

public:
    Input<Buffer<uint8_t>> B1{"B1", 2};
    Input<Buffer<uint8_t>> A1{"A1", 2};
    Output<Buffer<uint16_t>> matmul_out1_fcn{"matmul_out1_fcn", 2};

    void generate() {
        RDom r(0, 100);
        matmul_out1(d1, d2) = sum(cast<uint16_t>(A1(d1, r)) * cast<uint16_t>(B1(r, d2)));
        matmul_out1_fcn(d1, d2) = matmul_out1(d1, d2);
    }

    void schedule() {
        // Schedule is determined by the autoscheduler. Need to set estimates on buffers.
        if (using_autoscheduler()) {
            B1.dim(1).set_estimate(0, 100);
            B1.dim(0).set_estimate(0, 100);
            A1.dim(1).set_estimate(0, 100);
            A1.dim(0).set_estimate(0, 100);
            matmul_out1_fcn.set_estimate(d1, 0, 100).set_estimate(d2, 0, 100);
        } else {
            // Default schedule
        }
    }

private:
    Var d1{"d1"};
    Var d2{"d2"};
    Func matmul_out1{"matmul_out1"};
};

HALIDE_REGISTER_GENERATOR(mMatmul_matmul_out1_fcn_halide_generator, mMatmul_matmul_out1_fcn_halide_gen)
```

I used binary 'Halide-17.0.1-x86-64-linux-52541176253e74467dabc42eeee63d9a62c199f6.tar.gz' downloaded from: https://github.com/halide/Halide/releases

My command for compiling the Halide Generator class is:

```shell
g++ mMatmul_matmul_out1_fcn_halide.cpp -std=c++17 ....../Halide-17.0.1-x86-64-linux/share/Halide/tools/GenGen.cpp -L ....../Halide-17.0.1-x86-64-linux/lib -lHalide -I ....../Halide-17.0.1-x86-64-linux/include -o mMatmul_matmul_out1_fcn_halide
```

My command for running the generator with Adams2019 (which gives the segmentation fault):

```shell
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:....../Halide-17.0.1-x86-64-linux/lib
./mMatmul_matmul_out1_fcn_halide -f myPipeline -g mMatmul_matmul_out1_fcn_halide_gen -e h,o target=hexagon-32-noos-hvx-no_runtime autoscheduler.parallelism=2 autoscheduler=Adams2019 -p ....../Halide-17.0.1-x86-64-linux/lib/libautoschedule_adams2019.so -o ./
```

My command for running the generator with no auto-scheduler (which works for me):

```shell
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:....../Halide-17.0.1-x86-64-linux/lib
./mMatmul_matmul_out1_fcn_halide -f myPipeline -g mMatmul_matmul_out1_fcn_halide_gen -e h,o target=hexagon-32-noos-hvx-no_runtime -o ./
```

abadams (Member) commented Apr 11, 2024

Looks like it's a compiler bug caused by the adams autoscheduler not really understanding what to do on hexagon, and producing some very strange code that then hit a corner case bug in the simplifier.

Let's use the human Adams autoscheduler instead. A reasonable schedule for this pipeline is:

```cpp
matmul_out1_fcn.vectorize(d1, 128).parallel(d2, (B1.dim(1).extent() + 3) / 4);
```

but a more typical matmul schedule (for large matrices) is

```cpp
void generate() {
    RDom r(0, 100);
    // Note: changed from sum to += so that I can schedule the reduction var
    matmul_out1(d1, d2) += cast<uint16_t>(A1(d1, r)) * cast<uint16_t>(B1(r, d2));
    matmul_out1_fcn(d1, d2) = matmul_out1(d1, d2);

    Var d1i, d2i, d1o, d2o;
    matmul_out1_fcn.tile(d1, d2, d1o, d2o, d1i, d2i, 3 * 128, 4).vectorize(d1i, 128).unroll(d1i).unroll(d2i).parallel(d2o);
    matmul_out1.compute_at(matmul_out1_fcn, d1o).vectorize(d1, 128).unroll(d1).unroll(d2);
    matmul_out1.update().reorder(d1, d2, r).vectorize(d1, 128).unroll(d1).unroll(d2);
}
```

I usually do my scheduling inside the generate() method. In this case I needed to access the RDom. You could also make the RDom a class member instead of a local.

For a great schedule, you need to start worrying about things like managing dmas into Hexagon's cache.
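The loop nest that the tiling schedule above describes can be sketched in plain C++ (this is not Halide output; the function name and tile sizes are illustrative). The output is tiled, each tile gets a local accumulator (the role of the `compute_at`'d `matmul_out1`), and within a tile the reduction loop runs outermost with `d1` innermost, matching `reorder(d1, d2, r)` so the inner loop is the one that vectorizes:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Tiled matmul sketch: A is M x K, B is K x N (row-major), C = A * B.
// Mirrors the schedule's structure: tile the output, accumulate per tile,
// reduction loop k outermost within the tile, d1 (ii) innermost.
void matmul_tiled(const std::vector<uint16_t> &A, const std::vector<uint16_t> &B,
                  std::vector<uint32_t> &C, int M, int K, int N,
                  int tile_i, int tile_j) {
    for (int io = 0; io < M; io += tile_i) {
        for (int jo = 0; jo < N; jo += tile_j) {
            int ih = std::min(tile_i, M - io);
            int jh = std::min(tile_j, N - jo);
            // Per-tile accumulator: the compute_at'd intermediate.
            std::vector<uint32_t> acc(ih * jh, 0);
            for (int k = 0; k < K; k++) {             // r outermost in the tile
                for (int jj = 0; jj < jh; jj++) {     // d2
                    for (int ii = 0; ii < ih; ii++) { // d1 innermost (vector lane)
                        acc[jj * ih + ii] +=
                            uint32_t(A[(io + ii) * K + k]) * B[k * N + (jo + jj)];
                    }
                }
            }
            // Write the finished tile back to the output.
            for (int jj = 0; jj < jh; jj++)
                for (int ii = 0; ii < ih; ii++)
                    C[(io + ii) * N + (jo + jj)] = acc[jj * ih + ii];
        }
    }
}
```

Keeping the accumulator tile-sized is what lets the unrolled inner loops stay in registers; in the Halide schedule that is the effect of `compute_at(matmul_out1_fcn, d1o)` combined with the unrolls.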
