Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dxc generates invalid alignment on groupshared matrix load/store instructions #6416

Closed
dmpots opened this issue Mar 14, 2024 · 0 comments · Fixed by #6589
Closed

dxc generates invalid alignment on groupshared matrix load/store instructions #6416

dmpots opened this issue Mar 14, 2024 · 0 comments · Fixed by #6589
Assignees
Labels
bug Bug, regression, crash correctness Bugs that impact shader correctness matrix-bug Bugs relating to matrix types

Comments

@dmpots
Copy link
Member

dmpots commented Mar 14, 2024

Description
When loading or storing a matrix to groupshared dxc is using an alignment on the instruction that is not guaranteed by the underlying global variable.

Steps to Reproduce
https://godbolt.org/z/9c9os31MW

// dxc /Tcs_6_0 alignbug.hlsl
struct Data {
   float4x4 m;
};

groupshared Data GData;
StructuredBuffer<Data> input : register(t0);
RWStructuredBuffer<Data> output : register(u0);

[RootSignature("SRV(t0), UAV(u0)")]
[numthreads(128,1,1)]
void main(uint Id : SV_DispatchThreadId, uint g : SV_GroupID)
{
   GData = input[0];
   GroupMemoryBarrierWithGroupSync();
   output[Id] = GData;

}

Actual Behavior
Compiling this shader produces loads and stores that do not match the alignment of the underlying global.

The global does not specify an alignment so it will get the default alignment of 4 for this type

@"\01?GData@@3UData@@A.0.v.v" = addrspace(3) global [16 x float] undef

But the load

%22 = load float, float addrspace(3)* getelementptr inbounds ([16 x float], [16 x float] addrspace(3)* @"\01?GData@@3UData@@A.0.v.v", i32 0, i32 0), align 16

and store

store float %3, float addrspace(3)* getelementptr inbounds ([16 x float], [16 x float] addrspace(3)* @"\01?GData@@3UData@@A.0.v.v", i32 0, i32 0), align 16

both specify an alignment of align 16 for the first element (other elements are overaligned as well).

From the llvm language reference manual, the aligment on the instruction should be guaranteed by somehow, but the alignment of the global does not do it.

The optional constant align argument specifies the alignment of the operation (that is, the alignment of the memory address). It is the responsibility of the code emitter to ensure that the alignment information is correct.

Environment

  • DXC version dxc main as of 3/14/2024
  • Host Operating System windows
@dmpots dmpots added bug Bug, regression, crash needs-triage Awaiting triage labels Mar 14, 2024
@damyanp damyanp changed the title dxc generates invalid alignment on groupshared matric load/store instructions dxc generates invalid alignment on groupshared matrix load/store instructions Mar 15, 2024
@damyanp damyanp removed the needs-triage Awaiting triage label Mar 15, 2024
@damyanp damyanp added this to the Next Release milestone Mar 15, 2024
@damyanp damyanp added matrix-bug Bugs relating to matrix types correctness Bugs that impact shader correctness labels Mar 15, 2024
pow2clk added a commit to pow2clk/DirectXShaderCompiler that referenced this issue May 6, 2024
When flattening the global for a groupshared matrix, the alignment
information was getting lost. As a result, the alignments of the loads
and stores were calculating their own alignment based on preferred
alignment and trailing zeros of the index. The preferred alignment
switched to 16 when the type size was over 128 bits due to a heuristic
whose rationale is lost to time. When the global has its own alignment,
that gets used, so by retaining it through lowering, the alignments are
consistent and reliable.

Includes testing for a few matrix variants

fixes microsoft#6416
pow2clk added a commit that referenced this issue May 22, 2024
When flattening the global for a groupshared matrix, the alignment
information was getting lost. As a result, the alignments of the loads
and stores were calculating their own alignment based on preferred
alignment and trailing zeros of the index. The preferred alignment
switched to 16 when the type size was over 128 bits due to a heuristic
whose rationale is lost to time. When the global has its own alignment,
that gets used, so by calculating it at lowering, the alignments are
consistent and reliable.

Includes testing for a few matrix variants and a pass test.

fixes #6416
pow2clk added a commit to pow2clk/DirectXShaderCompiler that referenced this issue May 22, 2024
…rosoft#6589)

When flattening the global for a groupshared matrix, the alignment
information was getting lost. As a result, the alignments of the loads
and stores were calculating their own alignment based on preferred
alignment and trailing zeros of the index. The preferred alignment
switched to 16 when the type size was over 128 bits due to a heuristic
whose rationale is lost to time. When the global has its own alignment,
that gets used, so by calculating it at lowering, the alignments are
consistent and reliable.

Includes testing for a few matrix variants and a pass test.

fixes microsoft#6416

(cherry picked from commit a6f4025)
@pow2clk pow2clk removed this from the May 2024 Release milestone May 22, 2024
@damyanp damyanp added this to the Next 2024 Release milestone May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug, regression, crash correctness Bugs that impact shader correctness matrix-bug Bugs relating to matrix types
Projects
Status: Done
Status: Triaged
Development

Successfully merging a pull request may close this issue.

3 participants