Skip to content
This repository has been archived by the owner on May 9, 2024. It is now read-only.

Add GpuSharedMemory Tests for Intel GPU #537

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

lmontigny
Copy link
Contributor

Problem
We need specific tests for Shared Memory on Intel GPU.
Currently we have only shared memory test for Nvidia GPU with CUDA.

Solution
This PR adds the ./Tests/GpuSharedMemoryTestIntel executable.
I took the GpuSharedMemoryTest.cpp code and adapted it from Nvidia CUDA to L0

Impact
It will allows us to test our SM implementation, work in progress #534

@kurapov-peter
Copy link
Contributor

This mostly the same as the original smem test, right? The only real difference is the kernel submission. Can we get rid of code duplication for all the codegen and helpers methods?

Copy link
Contributor

@kurapov-peter kurapov-peter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also mark it as "enabling" for it to become a part of regular testing.

@lmontigny
Copy link
Contributor Author

  • Fixed style-check for CI
  • Need to fix ./Tests/GpuSharedMemoryTestIntel

@lmontigny lmontigny self-assigned this Jul 4, 2023
@lmontigny
Copy link
Contributor Author

lmontigny commented Jul 5, 2023

Current issue with GpuSharedMemoryTestIntel executable:

$ ./Tests/GpuSharedMemoryTestIntel 
[==========] Running 7 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 7 tests from SingleColumn
[ RUN      ] SingleColumn.VariableEntries_CountQuery_4B_Group
Adding global slm.buf.i64
PerformReductionTest - after linking
PerformReductionTest - before calling conv
PerformReductionTest - after calling conv
PerformReductionTest - before linking
compile_and_link_gpu_code - before writeSpirv
compile_and_link_gpu_code - before spv_to_bin
unknown file: Failure
C++ exception with description "L0 error: error occurred when building module, see build log for details" thrown in the test body.
[  FAILED  ] SingleColumn.VariableEntries_CountQuery_4B_Group (70 ms)

hdk_log:

 2023-07-05T10:12:06.622687 I 3915859 0 0 L0Mgr.cpp:76 Discovered 1 driver(s) for L0        platform.
 2023-07-05T10:12:06.661611 I 3915859 0 0 L0Mgr.cpp:229 L0 module build log: error: IGC     SPIRV Consumer does not support the following opcode: 65534
 ^@

IR: https://gist.github.com/lmontigny/3663842110065557e18df09ea5f658d5

@kurapov-peter Can you have a look the IR & GpuSharedMemoryTestIntel.cpp around init_smem_func L134? I took the GpuSharedMemoryTest.cpp (cuda) and modified to support L0 - I'm not sure it's 100% correct.

@kurapov-peter
Copy link
Contributor

I see that none of the functions except for the wrapper_kernel have a proper calling convention set. If that's the final representation of the kernel you convert to spirv I'd assume the error you are seeing is related, e.g. it tries to execute generate a call instruction with the default calling conv and doesn't find such in the isa.

@kurapov-peter
Copy link
Contributor

Another thing to check is that your group-by buffer size does not produce indexes that are out of the allocated SLM.

@kurapov-peter
Copy link
Contributor

The last thing that caught my eye is that there's a declaration of the init_shared_mem that doesn't match the definition, so you might be calling a non-existent function.

@kurapov-peter
Copy link
Contributor

Oh, and I think init_smem_func uses global address space instead of the shared one. This is the most probable reason you get the error.

@@ -0,0 +1,813 @@
/*
* Copyright 2019 OmniSci, Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one should be fixed I guess

@lmontigny
Copy link
Contributor Author

Previous Spirv IGC opcode issue fixed internally.
Now looking at C++ exception with description "L0 error: kernel name is not found in the module" thrown in the test body

@lmontigny
Copy link
Contributor Author

The last thing that caught my eye is that there's a declaration of the init_shared_mem that doesn't match the definition, so you might be calling a non-existent function.

In the latest IR with the IGC fix, I don't have any declaration for init_shared_mem, is it normal? attached IR

@lmontigny
Copy link
Contributor Author

Fixed large portion of the codegen at different level, executable now running.

Next step is to fix the computation to have correct results. (currently cpu != gpu results)

hdk/omniscidb/Tests/GpuSharedMemoryTestIntel.cpp:416: Failure
Expected equality of these values:
cmp_result
Which is: 1
0
[ FAILED ] SingleColumn.VariableEntries_CountQuery_4B_Group (247 ms)
[----------] 1 test from SingleColumn (247 ms total)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants