-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Push constant proposal #4612
base: main
Are you sure you want to change the base?
Push constant proposal #4612
Conversation
Previews, as seen when this build job started (f15e36a): |
proposals/push-constants.md
Outdated
dictionary GPUPipelineLayoutDescriptor | ||
: GPUObjectDescriptorBase { | ||
required sequence<GPUBindGroupLayout> bindGroupLayouts; | ||
uint32_t pushConstantsSize = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be explicit what size this is (byte size? number of 32-bit parameters?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Byte size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe pushConstantBytes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to immediateDataRangeByteSize
:)
proposals/push-constants.md
Outdated
uint32_t pushConstantsSize = 0; | ||
}; | ||
``` | ||
two pipeline layouts are defined to be “compatible for push constants” if they were created with identical push constant size. It means push constants values can share between pipeline layouts that are compatible for push constants. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume it's a compile error for a module to use push constants outside of its defined size range?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughts is that we follow out-of-bounds access here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can validate that the shader module uses no more than the push constants allowed by the module so we should be fine. Also dynamic indexing of push constants is not possible so in the shader we know that all accesses are in bounds.
proposals/push-constants.md
Outdated
| --- | --- | --- | --- | --- | --- | --- | | ||
| `var<push_constant>` | Immutable | [Module](https://www.w3.org/TR/WGSL/#module-scope) | [Concrete](https://www.w3.org/TR/WGSL/#type-concrete) [constructible](https://www.w3.org/TR/WGSL/#constructible) [host-shareable](https://www.w3.org/TR/WGSL/#host-shareable) excludes [array types](https://www.w3.org/TR/WGSL/#array-types) and [structure types](https://www.w3.org/TR/WGSL/#struct-types) contains array members | Disallowed | | Yes. [uniform buffer](https://www.w3.org/TR/WGSL/#uniform-buffer) | | ||
|
||
NOTE: Each [entry point](https://www.w3.org/TR/WGSL/#entry-point) can statically use at most one push constant variable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why, exactly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make implementation a simpler. So we don't need to introduce @offset
in wgsl to describe multiple push constant variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to enforce a single push constant interface we should consider an alternate design of making it a parameter to the entry point (identified via an attribute). If you did consider it, what did you see as the draw backs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally would prefer push constants to be like bindings, and not tied to a specific entry point. I tend to keep my bindings/push constants in a separate file I can reuse across passes like so.
HLSL seems to have requirements around having push constants be structs anyways gfx-rs/wgpu#5571
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was my strawman in #75 (comment) considered at all? If we assume 4-byte elements and alignment, I think this could become a lot easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@magcius in that proposal, where are @parameter
variable declared? At the global scope of as function arguments? It seems very similar to specifying an offset, except that it stays in "words" instead of bytes. It'd be nice to support smaller data types in push constants eventually so IMHO it's still be better to act on bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to enforce a single push constant interface we should consider an alternate design of making it a parameter to the entry point (identified via an attribute). If you did consider it, what did you see as the draw backs?
I agree it is an optional design. Personally, I'm slightly prefer current address space mode proposal because:
- It defines a global scope variable (like uniform), so that the helper functions could use it directly. ( instead of accepting attributes as parameter, if we pass push constant variable as entry point function parameter)
- From API aspect,
pushConstantBytes
is related withPipelineLayout
rather thanShaderModule
. Take it as a special binding resource is natural.
proposals/push-constants.md
Outdated
|
||
```javascript | ||
interface GPUCommandEncoder { | ||
void setPushConstantU32(uint32_t offset, uint32_t value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What units is offset
in? If bytes, are there any restrictions? If I do setPushConstantU32(1, 0x12345678);
, will it overwrite bytes 1, 2, 3, 4 and leave byte 0 alone? Should there be alignment restrictions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Same as bufferOffset )
It is in bytes, and it should be multiple of 4 bytes. And the alignment should follow this(https://gpuweb.github.io/gpuweb/wgsl/#alignment-and-size). So it means a the pushconstantSize = variable sizes + padding sizes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about f16? Should it just be set via a packed shared buffer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F16 should be set via packed shared buffer, or setPushConstantU32
so that we don't change the bits.
proposals/push-constants.md
Outdated
void setPushConstantU32(uint32_t offset, uint32_t value); | ||
void setPushConstantI32(uint32_t offset, int32_t value); | ||
void setPushConstantF32(uin32_t offset, float32_t value); | ||
void setPushConstants(uint32_t offset, uint32_t count, ArrayBuffer data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably take a AllowSharedBufferSource
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it could.
proposals/push-constants.md
Outdated
|
||
| Address space | Sharing among invocations | Default access mode | Notes | | ||
| --- | --- | --- | --- | | ||
| `push_constant` | Invocations in [shader stage](https://www.w3.org/TR/WGSL/#shader-stages) | [read](https://www.w3.org/TR/WGSL/#access-read) | For [uniform buffer](https://www.w3.org/TR/WGSL/#uniform-buffer) variables exclude [array types](https://www.w3.org/TR/WGSL/#array-types) variable or [structure types](https://www.w3.org/TR/WGSL/#struct-types) variable contains [array types](https://www.w3.org/TR/WGSL/#array-types) attributes | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the uniform buffer reference a copy paste error?
Separately, are arrays excluded to avoid Vulkan's default uniform access requirements? It will seem arbitrary to users unfamiliar with Vulkan. Is this a better resolution than requiring non-uniform indexing (probably)? Is it better than having a uniformity requirement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the uniform buffer reference a copy paste error?
No, I think uniform buffer
variable defined in wgsl aligns with what I want with push constant
. I think push constant variables
should be host-shareable constructible type, and must satisfy the address space layout constraints.
Separately, are arrays excluded to avoid Vulkan's default uniform access requirements?
Not only Vulkan, but also DX12. DX12 says Arrays are not permitted in constant buffers that get mapped onto root constants since dynamic indexing in the root signature space is not supported (https://learn.microsoft.com/en-us/windows/win32/direct3d12/using-constants-directly-in-the-root-signature)
proposals/push-constants.md
Outdated
| --- | --- | --- | --- | --- | --- | --- | | ||
| `var<push_constant>` | Immutable | [Module](https://www.w3.org/TR/WGSL/#module-scope) | [Concrete](https://www.w3.org/TR/WGSL/#type-concrete) [constructible](https://www.w3.org/TR/WGSL/#constructible) [host-shareable](https://www.w3.org/TR/WGSL/#host-shareable) excludes [array types](https://www.w3.org/TR/WGSL/#array-types) and [structure types](https://www.w3.org/TR/WGSL/#struct-types) contains array members | Disallowed | | Yes. [uniform buffer](https://www.w3.org/TR/WGSL/#uniform-buffer) | | ||
|
||
NOTE: Each [entry point](https://www.w3.org/TR/WGSL/#entry-point) can statically use at most one push constant variable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to enforce a single push constant interface we should consider an alternate design of making it a parameter to the entry point (identified via an attribute). If you did consider it, what did you see as the draw backs?
proposals/push-constants.md
Outdated
|
||
```javascript | ||
interface GPUCommandEncoder { | ||
void setPushConstantU32(uint32_t offset, uint32_t value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about f16? Should it just be set via a packed shared buffer?
proposals/push-constants.md
Outdated
Four new functions in `GPUCommandEncoder`. | ||
|
||
```javascript | ||
interface GPUCommandEncoder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like these commands should either take a pipeline or be on render/compute encoders.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On render/compute encoders. And that's why I tend to add these APIs in GPUCommandEncoder
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it stands wouldn't all pipelines need the same push constants. That's more what I was getting at of needing to associate the command with a particular pipeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GPUCommandEncoder is the top-level encoder. If you want to add these methods to render/compute encoders you need to add them to GPUBindingCommandsMixin (assuming they be used in render bundles).
We could associate push constant values with a particular pipeline by passing them as part of setPipeline()
(kind of parallel to overrides in createPipeline()
), but since IIUC setPushConstant()
is often going to be called multiple times per setPipeline()
, I think it makes more sense to make it parallel setBindGroup()
: there are separate states for the current {pipeline, bind groups, push constants} and we capture all the current states each time we hit a draw/dispatch call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it stands wouldn't all pipelines need the same push constants. That's more what I was getting at of needing to associate the command with a particular pipeline.
Yes. My expected use case is:
passEncoder.setPipeline().
passEncoder.setBindGroup();
passEncoder.setPushConstant();
...
// setBindGroup();
// setPushConstant();
If you want to add these methods to render/compute encoders you need to add them to GPUBindingCommandsMixin (assuming they be used in render bundles).
Thanks! That's the place I expected.
proposals/push-constants.md
Outdated
void setPushConstantU32(uint32_t offset, uint32_t value); | ||
void setPushConstantI32(uint32_t offset, int32_t value); | ||
void setPushConstantF32(uin32_t offset, float32_t value); | ||
void setPushConstants(uint32_t offset, uint32_t count, AllowSharedBufferSource data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kelsey points out we would want to add a dataOffset and dataCount (like other entry points that take BufferSources) so we can point directly into a larger array buffer instead of copying it into a new ArrayBuffer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the API shape to void setImmediateDataRange(uint32_t rangeOffset, AllowSharedBufferSource data, optional dataOffset, optional size);
proposals/push-constants.md
Outdated
void setPushConstantU32(uint32_t offset, uint32_t value); | ||
void setPushConstantI32(uint32_t offset, int32_t value); | ||
void setPushConstantF32(uin32_t offset, float32_t value); | ||
void setPushConstants(uint32_t offset, uint32_t count, AllowSharedBufferSource data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general this looks good, but I think we only need the last signature taking an AllowSharedBufferSource
and not the other overloads (U32, I32, F32, etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. But tend to keep these U32, I32, F32 for convenience , especially we tend to support small amount of data uploading.
I think this will have impact on the already complex matrix of GPUSupportedLimits. Have we considered making setVertexBuffer take an ArrayBufferSource? Then we don't need to change the limits. I didn't take a look at the D3D or Vulkan backends yet so maybe that is the motivation? On Metal, this is as simple as calling setVertexBytes: / setFragmentBytes: |
Assuming that push constants would be implemented using D3D12's root constants... If I'm not mistaken, D3D12 has a fixed (per-device?) amount of space in the root signature. In shipping WebGPU, the WebGPU implementation gets to decide how to use that space. For example, if the WebGPU Pipeline Layout includes 2 textures, the WebGPU implementation can place those 2 texture descriptors directly inside the D3D12 root signature. Or, alternatively, the WebGPU implementation can choose to put both texture descriptors into a descriptor table, and make the root signature point to the descriptor table. The WebGPU implementation has freedom here to use this fixed root signature space as it desires. Okay, now let's add in support for push constants. Push constants would also live directly within the fixed root signature space. But what if all the root signature space is already occupied (e.g. by those texture descriptors from the above paragraph)? Even if WebGPU only allows a single push constant, that doesn't actually solve the problem. The root signature space may be 100% occupied by descriptors, leaving no space at all for push constants. There are a variety of potential solutions to this problem:
|
Previously conducted performance analysis of push constants: |
GPU Web WG 2024-05-08 Pacific-time
|
Just point out possible implementation issues I observed:
|
It sounds like D3D and Vulkan can be optimized to not use a buffer when the push constants are small enough and there is space. With Metal, we always need to reserve a buffer slot for the push constants. Though it would be possible to map them all into one buffer slot. Abstracting this concept seems to further justify that the term 'push constants' does not necessarily need to exist in WebGPU. Rather, the D3D and Vulkan backends can implement small, dynamic setBindGroup calls into push constants and on Metal, we would use set[Vertex/Fragment]Bytes unless the data is large, then we would make a buffer |
If you tie push constants to specifically render pipelines, then you can't use them for compute shaders (and later, other types of pipelines), which would be unfortunate. |
Yeah, people in this thread are using a variety of different terminology. Above, when people say "setVertexBytes()" or "setFragmentBytes()," I believe they are referring to the Metal functions of the same name, as an example of a function that takes immediate data (rather than a method taking resources or bind groups). Even in Metal, there's a way to supply immediate data to compute pipelines. |
tl;dr: Thinking deeper, I think the only way it can work in Vulkan is either to just not add the feature at all, or add an explicit API for it. I think the biggest unanswered question here regarding "let push constants be an implementation detail" is this: Consider such a program:
The question is: What should the code in the shader be to handle both of these cases? In the first case, we'd want to promote the In SPIR-V, the text of the shader is actually different depending on whether or not you're accessing a push constant or not. For example, in GLSL (which I'm using here just as a readable form of SPIR-V), you'd either have this:
Or this:
If we operate under the constraint that no amount of recompilation is allowed to happen while commands are being recorded into the render pass - not even supplying specialization constants - then the only way this can work is if both blocks are present in the shader, and each access to the data is guarded by a flag - in software - to determine whether the data was bound to Assuming it is in fact a dealbreaker, it seems like the only way this could ever work (again, assuming that the committee does want to make it work, which I'm still not convinced has been sufficiently justified) would be to come up with a system where the implementation can guarantee that the immediate data being set by the app will fit into push constant space. This is why I suggested a But there's even an additional problem - what about the following program:
How should this work, assuming no amount of recompilation is acceptable? Imagine the bind group contains a buffer whose contents is populated by a previous compute shader - the CPU has never seen this data, and therefore such data cannot be hoisted to push constants. There would have to be some way, at the API level, to distinguish the immediate data from the bind group data, so that the shader wouldn't have to handle both cases in software at runtime. I think the only way this could work is to add a new type of descriptor in the bind group layout. That way, an entry in the bind group layout could either refer to a buffer, or refer to immediate data, but not both. Any data set by immediates for one draw call would be incapable of being clobbered by a GPU-populated buffer for a subsequent draw call - thereby allowing the shader compiler to know that it's safe to only emit the So then that begs the question: What should this new type of descriptor in the bind group layout be named? Perhaps it should be named ... "push constant"... |
This is turning into a bit of a bikeshed opportunity, but I'll add my two cents to the jar:
To give a bit of an unprompted history lesson and awareness of how these features are implemented in practice... The simplified AMD interpretation of push constants (not applicable to all drivers and hardware, just giving one mental model): The hardware has some general-purpose 32-bit scalar registers (SGPRs); these registers are active storage for threads, but also can be initialized with user constants set up by the hardware that launches threads. Some of these registers are used by the driver itself for internal purposes (e.g. the "base draw offset" feature from Vulkan would get passed by an SGPR), some of them are used by the driver to implement binding models (D3D12 CBVs might get mapped directly to an SGPR), and some of them are used to store memory addresses to binding tables (e.g. if you have a lot of textures, the driver will allocate a block of memory which has the texture descriptors, and store a pointer to the table in an SGPR; these are D3D12's "descriptor tables"). The D3D12 idea was that all root signature parameters (root constants, CBVs, descriptor tables) would map to SGPRs; however, as the driver sometimes needs some for its own purposes, this isn't a 1:1 mapping, and sometimes the driver has to jettison some of the parameters to a memory block and do some finagling to rearrange all of the binding updates. Note that since the usage of an SGPR is arbitrary (it can contain a memory address or a value, just depending on how it's used), the usage needs to be determined at root signature time; in practice, the AMD compiler effectively reserves a few driver SGPRs that work across all platforms. So, in order to make "transparent push constants" work, we can't simply replace the first N bytes of a uniform buffer of a bind group with immediate data; using it with a bind group will require a pointer in the SGPRs, while using it with push constants will require the values in SGPRs. Instead, we'll need syntax that lets us optimistically compile the first N push constants to SGPRs, and then the rest get jettisoned to a memory buffer. This is enabled a bit more by the extra pipeline layout field so we know up-front what will fit, but it still seems more complicated than just having a small fixed-size limit in the specification. We can always expand to support bigger sizes through emulation later. |
I think we have to be careful here - if the biggest use case is simply "make some indices available to a shader" then a simple API convenience method would be sufficient, without changing the binding model or adding push constants at all. |
(I just realized that we already have that simple API convenience method: |
A quick thoughts is that developer don't need to call |
writeBuffer cannot be used while inside of a render pass, as it executes at the queue level |
Aha! You’re right. So I guess the alternative would be “something like writeBuffer but works in a render pass” |
There is no way to support writeBuffer-but-it-works-in-a-render-pass easily in D3D12 and Vulkan. You can support a limited amount of data being sent immediately to the shader through D3D12 root constants and Vulkan push constants, but that limit is fairly small, and it requires special shader changes. One alternative to push constants on Vulkan is dynamic uniform buffer offsets, but those have heavy limitations on alignment that can be detrimental. If you go down the end of this road, you end up at OpenGL uniforms/uniform buffers, and D3D11 constant buffers, both of which have numerous, well-documented pitfalls and performance hazards. The easiest and safest way to use D3D11 constant buffers were with the DISCARD/NOOVERWRITE flags which effectively turned them into our dynamic uniform buffers. They were replaced in D3D12/Vulkan.
What kind of features are you thinking of? |
proposals/push-constants.md
Outdated
dictionary GPUPipelineLayoutDescriptor | ||
: GPUObjectDescriptorBase { | ||
required sequence<GPUBindGroupLayout> bindGroupLayouts; | ||
uint32_t pushConstantsByte = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is pushConstantsByte
? Should this be pushConstantsSize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, maybe pushConstantsBytes
? The previous comments is that we don't know the pushConstantSize
is defined as bytes or number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing to immediateDataRangeByteSize
:)
proposals/push-constants.md
Outdated
@@ -0,0 +1,73 @@ | |||
# PushConstants |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we really use the term 'push constants' which seems like a non-universal, backend specific name as opposed to something like 'immediate data'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
We could also make setBindGroup take an ArrayBuffer and document that the corresponding pipeline's immediate / push constant size in bytes must be >= size in bytes of the given ArrayBuffer or ArrayBuffers bound using this immediate mode. I didn't check the D3D specifics but that seems like it would work in Vulkan and Metal |
My understanding is that the benefit of this is we don't introduce new API name? |
Recapping a bunch of the discussion above (hopefully the non-controversial points):
The "immediate" terminology @litherum used is nice because it is related to what the developer intends to do (immediately set this piece of data) (also similar to immediates in assembly instructions like "move") instead of a potential hardware mechanism (push constants). IMHO it would be better to have immediates set directly instead of via an aggregate command like @mwyrzykowski isn't your implementation of WebGPU using |
Yes we already call setBytes/setVertexBytes/setFragmentBytes. This is a good point, we could reuse the slot assuming this is limited to a certain number of bytes (like 1024 or less). So then no impact to the limits 👍 I agree the term 'immediate' is very much preferred over 'push constants'. |
I don't think this has been proven yet. This would be true if:
|
You're right that, from within a render pass, it's impossible* to update a buffer in a well-defined way (where "well-defined" means: a set of known draw calls from early in the render pass don't see the update, and a set of draw calls from later in the render pass do see the update). (*For some definition of "impossible.") Therefore, if the goal was just convenience, it would have to work like this:
(Note that the second call to |
I'm not sure that |
Sorry, I don't think I was being clear. From where I stand, it looks like there are these possible paths forward:
If performance is the goal (e.g. performance with push constants is shown to be consistently better than buffers on a variety of vendors' cards), and it's demonstrated that a software flag to switch between both codepaths is too slow, then option 5 is the only acceptable option. However, if convenience is the goal, or a software codepath flag is fast enough, options 2-4 are better than options 0 or 5. |
If e.g. Chrome does this but WebKit does not, this means that performance can vary rapidly between vendor, leading to a real issue with performance portability. Making portable web applications that run acceptably across vendors is already hard enough. While I don't think there's any spec guarantee that createShaderModule/createRenderPipeline only compiles a single shader, I think it's a reasonable assumption. Having web vendors spend precious startup time compiling multiple variations of the same shader "just in case" seems silly when I could instead tell the API what I want. From that perspective, options 2, 3, and 4 are unattractive to me as a web developer and would be more inconvenient to me than options 0 and 5, because now I have multiple ways of doing something and the performance characteristics can change not just based on the GPU vendor, but also the browser vendor as well. |
This seems to assume that performance is the motivation for push constants, which I'm not sure is true. But, if it is, you are correct, so long as the spec doesn't describe what the hoisting heuristic is (which the spec could do!)
None of the options I listed involve a single call to
I think this is reductive. The amount of push constants space in Vulkan will be different than the amount of root constant space in D3D12 (which will be different than any potential driver optimizations of |
I'll run my RootConstant Sample on various vendors GPUs to see the results.
I'm still concern about the implementation burdens we take. Especially on Vulkan backend we don't have extra descriptor set slot for implicit uploading buffer. And a switch-branch in shader has concerns too. I think this feature doesn't want to encourage developers to uploading large data. That's why I'm not a fan of 3, 4. I don't want to provide a method that hints people "We can upload large amount of data in render/compute passes". To avoid confusing developers about "which uploading approach should I choose ?", a LIMIT helps hint "Immediate data" is only for small amount of data, even we do a silently fallback in implementation in final state (For some large limit size). |
Push constants proposal. Related issue #75