-
Notifications
You must be signed in to change notification settings - Fork 303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Subtract position from Vulkan inter stage limits #4518
base: main
Are you sure you want to change the base?
Conversation
Previews, as seen when this build job started (760a32a): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, good catch. LGTM, matches my summary of the investigation/proposal. (Suggested edits to point directly at it.)
<td>[#1962](https://github.com/gpuweb/gpuweb/issues/1962) | ||
<td>`min(maxVertexOutputComponents // 4, maxFragmentInputComponents // 4)` | ||
<td>[#1962](https://github.com/gpuweb/gpuweb/issues/1962#issuecomment-1136316791) | ||
<td>`min(maxVertexOutputComponents, maxFragmentInputComponents) // 4 - 1` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, I think this change doesn't align with my summary:
maxInterStageShaderVariables (straw-person name). Base value 16.
Counting vertex outputs:
- All locations must be less than the limit.
- Add 1 for each used location.
Counting fragment inputs:
- All locations must be less than the limit.
- Add 1 for each used location.
(Note there's no location cost to using theposition
input3,front_facing
, orsample_index
1.)Limit derivation by backend (backend limit minus base cost):
Vulkan: Ranges from 16 to 32. Min of:
maxVertexOutputComponents / 4
(no base cost)maxFragmentInputComponents / 4
(no base cost)
3 Only D3D12 counts the fragment input
position
toward its limit. Because the limit can't go higher than 31 anyway (D3D12_VS_OUTPUT_REGISTER_COUNT
is fixed at 32, minus one for the vertex outputposition
which is always present), we also subsume the fragment outputposition
into the "base cost" for the limit, without detriment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that summary is self-inconsistent. If both limits are based on Vulkan's maxVertexOutputComponents/maxFragmentInputComponents, then they both have to account for position (either in the WebGPU spec accounting, or statically by adjusting the limit derivation for Vulkan backends).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, wait, I remember why the summary says that. It's because the Vulkan spec actually uses this limit for two arguably separate restrictions: max...Components
components, and max...Components / 4
locations. This one paragraph in this spec section defines both:
The number of input and output Location slots available for a shader input or output interface is limited, and dependent on the shader stage as described in Shader Input and Output Locations.
This is an intro to both restrictions.
All variables in both the built-in interface block and the user-defined variable interface count against these limits.
This is about the number of components across user-defined AND built-in variables.
Each effective Location must have a value less than the number of Location slots available for the given interface, as specified in the “Locations Available” column in Shader Input and Output Locations.
I thought this restricted the number of user-defined variables. But it is actually about the indices of user-defined variables. With the accounting fix to count every variable as 4 components, this actually does not restrict the number of user-defined variables any further than the first restriction.
In any case, it is consistent with our maxInterStageShaderVariables
which is also just a restriction on the indices of user-defined variables.
So I think this change isn't needed.
<td>`min(maxVertexOutputComponents, maxFragmentInputComponents) // 4 - 1` | |
<td>`min(maxVertexOutputComponents, maxFragmentInputComponents) // 4` |
In the spec, I guess there should probably be an adapter capability guarantee something like
maxInterStageShaderVariables * 4
≤ maxInterStageShaderComponents
≤ maxInterStageShaderVariables * 4 + 3
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, but is it valuable to have it slightly higher?
Users will run into maxInterStageShaderComponents
first anyway.
No description provided.