Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nahim shader defines #643

Open
wants to merge 18 commits into
base: master
Choose a base branch
from
Open

Conversation

devshgraphicsprogramming
Copy link
Member

Description

Testing

TODO list:

Comment on lines 484 to 486
#define NBL_GENERATE_GET_OR_DEFAULT(field, ty, default) \
template<typename S, bool = has_member_##field<S>::value> struct get_or_default_##field : integral_constant<ty,S::field> {}; \
template<typename S> struct get_or_default_##field<S,false> : integral_constant<ty,default> {};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this macro causes Boost.Waveto throw exceptions, find out why

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nipunG314 does it still throw exceptions?

NBL_GENERATE_GET_OR_DEFAULT(nonCoherentAtomSize, uint16_t, 0);

NBL_GENERATE_GET_OR_DEFAULT(subgroupSize, uint16_t, 0);
NBL_GENERATE_GET_OR_DEFAULT(subgroupOpsShaderStages, core::bitflag<asset::IShader::E_SHADER_STAGE, 0);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obviously there's no bitflag on hlsl, so use a plain uint8_t

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a small annoying note, not every bitflag is a uint8_t

maybe see if you can make a nbl::hlsl::bitflag thats similar to core::bitflag except without constexpr ?

Else if its not possible/usable, you could move the enums to .hlsl headers, e.g.

namespace nbl
{
namespace hlsl
{
enum ShaderStage : uint8_t
{
...
};
}
}

then add a using E_SHADER_STAGE = ShaderStage; alias in place of the original enum

NBL_GENERATE_GET_OR_DEFAULT(allowCommandBufferQueryCopies, bool, false);
NBL_GENERATE_GET_OR_DEFAULT(maxOptimallyResidentWorkgroupInvocations, uint32_t, 0);
NBL_GENERATE_GET_OR_DEFAULT(maxResidentInvocations, uint32_t, 0);
NBL_GENERATE_GET_OR_DEFAULT(spirvVersion, asset::CGLSLCompiler::E_SPIRV_VERSION, 0);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might need to make that enum global and then alias or something?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean here. Can't we just replace the enum with an integral type like the rest?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is what I mean
#643 (comment)

@devshgraphicsprogramming
Copy link
Member Author

Need a matching modification to nbl/video/CJITIncludeLoader.cpp

Base automatically changed from vulkan_1_3 to master March 11, 2024 17:24
Use uint32_t instead of VkExtent2D
Use vector types for small arrays
Replace core::bitflag<> with uint8_t
Comment on lines +16 to +17
NBL_GENERATE_MEMBER_TESTER(largePoints);
NBL_GENERATE_MEMBER_TESTER(shaderCullDistance);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alphaToOne is missing

NBL_GENERATE_MEMBER_TESTER(lineWidthGranularity);

NBL_GENERATE_MEMBER_TESTER(strictLines);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add standard sample locations

Comment on lines +229 to +232
NBL_GENERATE_MEMBER_TESTER(shaderUniformBufferArrayNonUniformIndexingNative);
NBL_GENERATE_MEMBER_TESTER(shaderSampledImageArrayNonUniformIndexingNative);
NBL_GENERATE_MEMBER_TESTER(shaderStorageImageArrayNonUniformIndexingNative);
NBL_GENERATE_MEMBER_TESTER(shaderInputAttachmentArrayNonUniformIndexingNative);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing shaderStorageBufferArrayNonUniformIndexingNative I think

NBL_GENERATE_GET_OR_DEFAULT(maxBufferViewTexels, uint32_t, 0);
NBL_GENERATE_GET_OR_DEFAULT(maxUBOSize, uint32_t, 0);
NBL_GENERATE_GET_OR_DEFAULT(MaxSSBOSize, uint32_t, 0);
NBL_GENERATE_GET_OR_DEFAULT(maxPushConstantsSize, uint32_t, 0);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pay attention to match the types in SPhysicalDeviceLimits.h

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ouch, also the default values... they can't all be 0

We'll eventually write a JSON + python based generator for these three headers

Comment on lines +2 to +33
"headerGuard": "_NBL_VIDEO_S_PHYSICAL_DEVICE_FEATURES_H_INCLUDED_",
"includePath": "nbl/video/",
"includes": [
"ECommonEnums.h"
],
"namespace": "nbl::video",
"structName": "SPhysicalDeviceFeatures",
"structComment": [
"Usage of feature API",
"## LogicalDevice creation enabled features shouldn't necessarily equal the ones it reports as enabled (superset)",
"**RARE: Creating a physical device with all advertised features/extensions:**",
"auto features = physicalDevice->getFeatures();",
"",
"ILogicalDevice::SCreationParams params = {};",
"params.queueParamsCount = ; // set queue stuff",
"params.queueParams = ; // set queue stuff",
"params.enabledFeatures = features;",
"auto device = physicalDevice->createLogicalDevice(params);",
"**FREQUENT: Choosing a physical device with the features**",
"IPhysicalDevice::SRequiredProperties props = {}; // default initializes to apiVersion=1.1, deviceType = ET_UNKNOWN, pipelineCacheUUID = '\\0', device UUID=`\\0`, driverUUID=`\\0`, deviceLUID=`\\0`, deviceNodeMask= ~0u, driverID=UNKNOWN",
"// example of particular config",
"props.apiVersion = 1.2;",
"props.deviceTypeMask = ~IPhysicalDevice::ET_CPU;",
"props.driverIDMask = ~(EDI_AMD_PROPRIETARY|EDI_INTEL_PROPRIETARY_WINDOWS); // would be goot to turn the enum into a mask",
"props.conformanceVersion = 1.2;",
"",
"SDeviceFeatures requiredFeatures = {};",
"requiredFeatures.rayQuery = true;",
"",
"SDeviceLimits minimumLimits = {}; // would default initialize to worst possible values (small values for maximum sizes, large values for alignments, etc.)"
],
"content": {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't have this stuff in the json and python...

Nobody is stopping you from making a SPhysicalDeviceFeatures.h that just looks like

#ifndef __NBL_VIDEO_S_PHYSICAL_DEVICE_FEATURES_H_INCLUDED_
#define __NBL_VIDEO_S_PHYSICAL_DEVICE_FEATURES_H_INCLUDED_

... all the includes ...

// the big comment

struct SPhysicalDeviceLimits.h
{
#include "generated/SPhysicalDeviceLimits_members.h" // or some other appropriate path as per @AnastaZIuk 
};

#endif

it will be much saner

Comment on lines +1590 to +1624
"[DO NOT EXPOSE] just reserved numbers, extension never shipped",
"VK_NV_extension_152",
"[ALIASED TO] VK_AMD_mixed_attachment_samples",
"VK_NV_framebuffer_mixed_samples",
"[DO NOT EXPOSE] For now. For 2D ui",
"VK_NV_fill_rectangle",
"[EXPOSE AS A LIMIT] Enabled by Default, Moved to Limits",
"ShaderSMBuiltinsFeaturesNV",
"VK_NV_shader_sm_builtins",
"Enabled by Default, Moved to Limits",
"VK_EXT_post_depth_coverage",
"[DEPRECATED] Vulkan 1.1 Core and ROADMAP 2022",
"SamplerYcbcrConversionFeaturesKHR",
"VK_KHR_sampler_ycbcr_conversion",
"[DEPRECATED] Core 1.1 implemented on default path and there's no choice in not using it",
"VK_KHR_bind_memory2",
"[DO NOT EXPOSE] Too ",
"intrinsically",
"linux\"\"VK_EXT_image_drm_format_modifier",
"[DO NOT EXPOSE] just reserved numbers, extension never shipped",
"VK_EXT_extension_160",
"[TODO LATER] Expose when we start to experience slowdowns from validation",
"VK_EXT_validation_cache",
"[DEPRECATED] KHR extension supersedes and then Vulkan 1.2",
"VK_EXT_descriptor_indexing",
"[DO NOT EXPOSE] We don't support yet",
"VK_EXT_shader_viewport_index_layer",
"VK_KHR_portability_subset - PROVISINAL/NOTAVAILABLEANYMORE",
"[DO NOT EXPOSE] not implementing or exposing VRS in near or far future, also has interactions with fragment density maps",
"ShadingRateImageFeaturesNV",
"VK_NV_shading_rate_image",
"[DEPRECATED] Superseded by KHR",
"VK_NV_ray_tracing",
"RepresentativeFragmentTestFeaturesNV",
"VK_NV_representative_fragment_test"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats broken down in a super weird way, I'd rather have them as hidden variables (no export) with visible comments

Comment on lines +2 to +17
"headerGuard": "_NBL_VIDEO_S_PHYSICAL_DEVICE_LIMITS_H_INCLUDED_",
"includePath": "nbl/asset/",
"includes": [
"utils/CGLSLCompiler.h",
"IImage.h",
"IRenderpass.h"
],
"stlIncludes": [
"type_traits"
],
"namespace": "nbl::video",
"structName": "SPhysicalDeviceLimits",
"structComment": [
"Struct is populated with Nabla Core Profile Limit Minimums"
],
"content": {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as for the feature struct

Comment on lines +2330 to +2341
"type": "function",
"function": [
"// we don't compare certain capabilities because they don't mean better/worse",
"inline bool isSubsetOf(const SPhysicalDeviceLimits& _rhs) const",
"{",
" if (maxImageDimension1D > _rhs.maxImageDimension1D) return false;",
" if (maxImageDimension2D > _rhs.maxImageDimension2D) return false;",
" if (maxImageDimension3D > _rhs.maxImageDimension3D) return false;",
" if (maxImageDimensionCube > _rhs.maxImageDimensionCube) return false;",
" if (maxImageArrayLayers > _rhs.maxImageArrayLayers) return false;",
" if (maxBufferViewTexels > _rhs.maxBufferViewTexels) return false;",
" if (maxUBOSize > _rhs.maxUBOSize) return false;",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be missing from the JSON, and the body should be autogenerated by python by just knowing the type of the member variable you declared above.

If you manually neeed to maintain this then we're back to square 1 this becomes a handwritten header but worse

Comment on lines +2316 to +2326
"// utility functions",
"// In the cases where the workgroups synchronise with each other such as work DAGs (i.e. `CScanner`),",
"// `workgroupSpinningProtection` is meant to protect against launching a dispatch so wide that",
"// a workgroup of the next cut of the DAG spins for an extended time to wait on a workgroup from a previous one.",
"inline uint32_t computeOptimalPersistentWorkgroupDispatchSize(const uint64_t elementCount, const uint32_t workgroupSize, const uint32_t workgroupSpinningProtection=1u) const",
"{",
" assert(elementCount!=0ull && \"Input element count can't be 0!\");",
" const uint64_t infinitelyWideDeviceWGCount = (elementCount-1ull)/(static_cast<uint64_t>(workgroupSize)*static_cast<uint64_t>(workgroupSpinningProtection))+1ull;",
" const uint32_t maxResidentWorkgroups = maxResidentInvocations/workgroupSize;",
" return static_cast<uint32_t>(core::min<uint64_t>(infinitelyWideDeviceWGCount,maxResidentWorkgroups));",
"}"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you follow the

Some handwritten code

struct MyStruct
{
#include "python_generated.h"

More handwritten code
};

Some more handwritten code

pattern, you can add some handwritten members/methods to the struct.

Comment on lines +323 to +325
# JSON
${NBL_ROOT_PATH}/src/nbl/video/device_limits.json
${NBL_ROOT_PATH}/src/nbl/video/device_features.json
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AnastaZIuk is this even a right thing to do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of course it's invalid, currently they seem to be inputs for combined gen.py and DeviceGen.py python scripts, they should be handled with custom target + custom command pair, those json files should be flagged as DEPENDS of the custom command. I guess output locations are known as well - they should be passed to the custom command's OUTPUT and custom target's DEPENDS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also handled with python or not - those scripts should never hardcode location to the inputs like do do currently, the script should take an input and expected output + be executed by a custom command which prepares the execution of the script given correct arguments for the script execution

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants