-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nahim shader defines #643
base: master
Are you sure you want to change the base?
Nahim shader defines #643
Conversation
#define NBL_GENERATE_GET_OR_DEFAULT(field, ty, default) \ | ||
template<typename S, bool = has_member_##field<S>::value> struct get_or_default_##field : integral_constant<ty,S::field> {}; \ | ||
template<typename S> struct get_or_default_##field<S,false> : integral_constant<ty,default> {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this macro causes Boost.Waveto throw exceptions, find out why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nipunG314 does it still throw exceptions?
NBL_GENERATE_GET_OR_DEFAULT(nonCoherentAtomSize, uint16_t, 0); | ||
|
||
NBL_GENERATE_GET_OR_DEFAULT(subgroupSize, uint16_t, 0); | ||
NBL_GENERATE_GET_OR_DEFAULT(subgroupOpsShaderStages, core::bitflag<asset::IShader::E_SHADER_STAGE, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
obviously there's no bitflag on hlsl, so use a plain uint8_t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a small annoying note, not every bitflag is a uint8_t
maybe see if you can make a nbl::hlsl::bitflag
thats similar to core::bitflag
except without constexpr
?
Else if its not possible/usable, you could move the enums to .hlsl
headers, e.g.
namespace nbl
{
namespace hlsl
{
enum ShaderStage : uint8_t
{
...
};
}
}
then add a using E_SHADER_STAGE = ShaderStage;
alias in place of the original enum
NBL_GENERATE_GET_OR_DEFAULT(allowCommandBufferQueryCopies, bool, false); | ||
NBL_GENERATE_GET_OR_DEFAULT(maxOptimallyResidentWorkgroupInvocations, uint32_t, 0); | ||
NBL_GENERATE_GET_OR_DEFAULT(maxResidentInvocations, uint32_t, 0); | ||
NBL_GENERATE_GET_OR_DEFAULT(spirvVersion, asset::CGLSLCompiler::E_SPIRV_VERSION, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we might need to make that enum global and then alias or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean here. Can't we just replace the enum with an integral type like the rest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is what I mean
#643 (comment)
Need a matching modification to |
Use uint32_t instead of VkExtent2D Use vector types for small arrays Replace core::bitflag<> with uint8_t
…into nahim_shader_defines
NBL_GENERATE_MEMBER_TESTER(largePoints); | ||
NBL_GENERATE_MEMBER_TESTER(shaderCullDistance); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alphaToOne is missing
NBL_GENERATE_MEMBER_TESTER(lineWidthGranularity); | ||
|
||
NBL_GENERATE_MEMBER_TESTER(strictLines); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add standard sample locations
NBL_GENERATE_MEMBER_TESTER(shaderUniformBufferArrayNonUniformIndexingNative); | ||
NBL_GENERATE_MEMBER_TESTER(shaderSampledImageArrayNonUniformIndexingNative); | ||
NBL_GENERATE_MEMBER_TESTER(shaderStorageImageArrayNonUniformIndexingNative); | ||
NBL_GENERATE_MEMBER_TESTER(shaderInputAttachmentArrayNonUniformIndexingNative); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing shaderStorageBufferArrayNonUniformIndexingNative
I think
NBL_GENERATE_GET_OR_DEFAULT(maxBufferViewTexels, uint32_t, 0); | ||
NBL_GENERATE_GET_OR_DEFAULT(maxUBOSize, uint32_t, 0); | ||
NBL_GENERATE_GET_OR_DEFAULT(MaxSSBOSize, uint32_t, 0); | ||
NBL_GENERATE_GET_OR_DEFAULT(maxPushConstantsSize, uint32_t, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pay attention to match the types in SPhysicalDeviceLimits.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ouch, also the default values... they can't all be 0
We'll eventually write a JSON + python based generator for these three headers
"headerGuard": "_NBL_VIDEO_S_PHYSICAL_DEVICE_FEATURES_H_INCLUDED_", | ||
"includePath": "nbl/video/", | ||
"includes": [ | ||
"ECommonEnums.h" | ||
], | ||
"namespace": "nbl::video", | ||
"structName": "SPhysicalDeviceFeatures", | ||
"structComment": [ | ||
"Usage of feature API", | ||
"## LogicalDevice creation enabled features shouldn't necessarily equal the ones it reports as enabled (superset)", | ||
"**RARE: Creating a physical device with all advertised features/extensions:**", | ||
"auto features = physicalDevice->getFeatures();", | ||
"", | ||
"ILogicalDevice::SCreationParams params = {};", | ||
"params.queueParamsCount = ; // set queue stuff", | ||
"params.queueParams = ; // set queue stuff", | ||
"params.enabledFeatures = features;", | ||
"auto device = physicalDevice->createLogicalDevice(params);", | ||
"**FREQUENT: Choosing a physical device with the features**", | ||
"IPhysicalDevice::SRequiredProperties props = {}; // default initializes to apiVersion=1.1, deviceType = ET_UNKNOWN, pipelineCacheUUID = '\\0', device UUID=`\\0`, driverUUID=`\\0`, deviceLUID=`\\0`, deviceNodeMask= ~0u, driverID=UNKNOWN", | ||
"// example of particular config", | ||
"props.apiVersion = 1.2;", | ||
"props.deviceTypeMask = ~IPhysicalDevice::ET_CPU;", | ||
"props.driverIDMask = ~(EDI_AMD_PROPRIETARY|EDI_INTEL_PROPRIETARY_WINDOWS); // would be goot to turn the enum into a mask", | ||
"props.conformanceVersion = 1.2;", | ||
"", | ||
"SDeviceFeatures requiredFeatures = {};", | ||
"requiredFeatures.rayQuery = true;", | ||
"", | ||
"SDeviceLimits minimumLimits = {}; // would default initialize to worst possible values (small values for maximum sizes, large values for alignments, etc.)" | ||
], | ||
"content": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't have this stuff in the json and python...
Nobody is stopping you from making a SPhysicalDeviceFeatures.h
that just looks like
#ifndef __NBL_VIDEO_S_PHYSICAL_DEVICE_FEATURES_H_INCLUDED_
#define __NBL_VIDEO_S_PHYSICAL_DEVICE_FEATURES_H_INCLUDED_
... all the includes ...
// the big comment
struct SPhysicalDeviceLimits.h
{
#include "generated/SPhysicalDeviceLimits_members.h" // or some other appropriate path as per @AnastaZIuk
};
#endif
it will be much saner
"[DO NOT EXPOSE] just reserved numbers, extension never shipped", | ||
"VK_NV_extension_152", | ||
"[ALIASED TO] VK_AMD_mixed_attachment_samples", | ||
"VK_NV_framebuffer_mixed_samples", | ||
"[DO NOT EXPOSE] For now. For 2D ui", | ||
"VK_NV_fill_rectangle", | ||
"[EXPOSE AS A LIMIT] Enabled by Default, Moved to Limits", | ||
"ShaderSMBuiltinsFeaturesNV", | ||
"VK_NV_shader_sm_builtins", | ||
"Enabled by Default, Moved to Limits", | ||
"VK_EXT_post_depth_coverage", | ||
"[DEPRECATED] Vulkan 1.1 Core and ROADMAP 2022", | ||
"SamplerYcbcrConversionFeaturesKHR", | ||
"VK_KHR_sampler_ycbcr_conversion", | ||
"[DEPRECATED] Core 1.1 implemented on default path and there's no choice in not using it", | ||
"VK_KHR_bind_memory2", | ||
"[DO NOT EXPOSE] Too ", | ||
"intrinsically", | ||
"linux\"\"VK_EXT_image_drm_format_modifier", | ||
"[DO NOT EXPOSE] just reserved numbers, extension never shipped", | ||
"VK_EXT_extension_160", | ||
"[TODO LATER] Expose when we start to experience slowdowns from validation", | ||
"VK_EXT_validation_cache", | ||
"[DEPRECATED] KHR extension supersedes and then Vulkan 1.2", | ||
"VK_EXT_descriptor_indexing", | ||
"[DO NOT EXPOSE] We don't support yet", | ||
"VK_EXT_shader_viewport_index_layer", | ||
"VK_KHR_portability_subset - PROVISINAL/NOTAVAILABLEANYMORE", | ||
"[DO NOT EXPOSE] not implementing or exposing VRS in near or far future, also has interactions with fragment density maps", | ||
"ShadingRateImageFeaturesNV", | ||
"VK_NV_shading_rate_image", | ||
"[DEPRECATED] Superseded by KHR", | ||
"VK_NV_ray_tracing", | ||
"RepresentativeFragmentTestFeaturesNV", | ||
"VK_NV_representative_fragment_test" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats broken down in a super weird way, I'd rather have them as hidden variables (no export) with visible comments
"headerGuard": "_NBL_VIDEO_S_PHYSICAL_DEVICE_LIMITS_H_INCLUDED_", | ||
"includePath": "nbl/asset/", | ||
"includes": [ | ||
"utils/CGLSLCompiler.h", | ||
"IImage.h", | ||
"IRenderpass.h" | ||
], | ||
"stlIncludes": [ | ||
"type_traits" | ||
], | ||
"namespace": "nbl::video", | ||
"structName": "SPhysicalDeviceLimits", | ||
"structComment": [ | ||
"Struct is populated with Nabla Core Profile Limit Minimums" | ||
], | ||
"content": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as for the feature struct
"type": "function", | ||
"function": [ | ||
"// we don't compare certain capabilities because they don't mean better/worse", | ||
"inline bool isSubsetOf(const SPhysicalDeviceLimits& _rhs) const", | ||
"{", | ||
" if (maxImageDimension1D > _rhs.maxImageDimension1D) return false;", | ||
" if (maxImageDimension2D > _rhs.maxImageDimension2D) return false;", | ||
" if (maxImageDimension3D > _rhs.maxImageDimension3D) return false;", | ||
" if (maxImageDimensionCube > _rhs.maxImageDimensionCube) return false;", | ||
" if (maxImageArrayLayers > _rhs.maxImageArrayLayers) return false;", | ||
" if (maxBufferViewTexels > _rhs.maxBufferViewTexels) return false;", | ||
" if (maxUBOSize > _rhs.maxUBOSize) return false;", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be missing from the JSON, and the body should be autogenerated by python by just knowing the type of the member variable you declared above.
If you manually neeed to maintain this then we're back to square 1 this becomes a handwritten header but worse
"// utility functions", | ||
"// In the cases where the workgroups synchronise with each other such as work DAGs (i.e. `CScanner`),", | ||
"// `workgroupSpinningProtection` is meant to protect against launching a dispatch so wide that", | ||
"// a workgroup of the next cut of the DAG spins for an extended time to wait on a workgroup from a previous one.", | ||
"inline uint32_t computeOptimalPersistentWorkgroupDispatchSize(const uint64_t elementCount, const uint32_t workgroupSize, const uint32_t workgroupSpinningProtection=1u) const", | ||
"{", | ||
" assert(elementCount!=0ull && \"Input element count can't be 0!\");", | ||
" const uint64_t infinitelyWideDeviceWGCount = (elementCount-1ull)/(static_cast<uint64_t>(workgroupSize)*static_cast<uint64_t>(workgroupSpinningProtection))+1ull;", | ||
" const uint32_t maxResidentWorkgroups = maxResidentInvocations/workgroupSize;", | ||
" return static_cast<uint32_t>(core::min<uint64_t>(infinitelyWideDeviceWGCount,maxResidentWorkgroups));", | ||
"}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you follow the
Some handwritten code
struct MyStruct
{
#include "python_generated.h"
More handwritten code
};
Some more handwritten code
pattern, you can add some handwritten members/methods to the struct.
# JSON | ||
${NBL_ROOT_PATH}/src/nbl/video/device_limits.json | ||
${NBL_ROOT_PATH}/src/nbl/video/device_features.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AnastaZIuk is this even a right thing to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of course it's invalid, currently they seem to be inputs for combined gen.py and DeviceGen.py python scripts, they should be handled with custom target + custom command pair, those json files should be flagged as DEPENDS
of the custom command. I guess output locations are known as well - they should be passed to the custom command's OUTPUT
and custom target's DEPENDS
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also handled with python or not - those scripts should never hardcode location to the inputs like do do currently, the script should take an input and expected output + be executed by a custom command which prepares the execution of the script given correct arguments for the script execution
Description
Testing
TODO list: