d3d11 unreliable behavior with ReadFirstLane--results appear to depend on random compiler optimizations #29

Open
Niadb opened this issue Jul 18, 2019 · 3 comments


Niadb commented Jul 18, 2019

I'm seeing very unreliable behavior when trying to use AmdDxExtShaderIntrinsics_ReadfirstlaneU.

Whether it works or not appears to depend on essentially random changes to the shader.

This is very unfortunate, because when it does work, the performance benefit compared to not using ReadfirstLane is significant: 45fps vs ~60fps.

When it fails, the result of ReadfirstlaneU seems very random, almost as if it isn't reading valid data, and whatever it is reading doesn't seem to be something that was contained in any of the lanes.

At a guess, I'd say this is related to the UAV "hack" used to bypass the D3D shader compiler. Perhaps the compiler is sometimes optimizing the output such that AGS can't find its hooks?

The shader in question is very large and uses a number of other AGS features.
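
For reference, the behavior expected from a readfirstlane-style operation can be modeled on the CPU. This is only a sketch of the semantics, not the AGS implementation: `ModelReadFirstLane`, `kLaneCount`, and the 64-lane wavefront size are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed wavefront width for this model (64 lanes); purely illustrative.
constexpr int kLaneCount = 64;

// Hypothetical helper: returns the value held by the lowest-numbered active
// lane, which is the value a readfirstlane-style operation should broadcast.
inline std::uint32_t ModelReadFirstLane(
    const std::array<std::uint32_t, kLaneCount>& laneValues,
    std::uint64_t execMask)
{
    assert(execMask != 0 && "at least one lane must be active");
    for (int lane = 0; lane < kLaneCount; ++lane) {
        if (execMask & (std::uint64_t{1} << lane)) {
            return laneValues[lane];
        }
    }
    return 0; // not reached when execMask is non-zero
}
```

Under this model, if every lane holds the same value (as with per-instance data), the result must equal that value no matter which lanes are active. A result that matches no lane's value, as described above, is therefore not explainable by the operation's semantics alone.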

@gareththomasamd
Contributor

Hi, are you definitely using /Ges (enable strictness) and not using /Od (disable optimizations)?

@Niadb
Author

Niadb commented Jul 18, 2019

Hi Gareth, I was compiling with:

shader_compile_flags = D3D10_SHADER_OPTIMIZATION_LEVEL3;

I tried adding the strictness flag:

shader_compile_flags = D3D10_SHADER_OPTIMIZATION_LEVEL3 | D3DCOMPILE_ENABLE_STRICTNESS;

But it doesn't seem to change the result; ReadFirstLane still produces noise.
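
A quick way to double-check a flag word against the suggestion above (strictness on, optimizations not skipped) is a small helper like the one below. This is a sketch: `FlagsMatchSuggestion` and the `k...` constants are hypothetical names, and the bit values are local stand-ins that I believe mirror `d3dcompiler.h`; real code should include the SDK header and use its definitions.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins so the sketch builds without the Windows SDK.
// Assumed to mirror d3dcompiler.h; verify against the SDK header.
constexpr std::uint32_t kSkipOptimization   = 1u << 2;   // D3DCOMPILE_SKIP_OPTIMIZATION (/Od)
constexpr std::uint32_t kEnableStrictness   = 1u << 11;  // D3DCOMPILE_ENABLE_STRICTNESS (/Ges)
constexpr std::uint32_t kOptimizationLevel3 = 1u << 15;  // D3DCOMPILE_OPTIMIZATION_LEVEL3

// Hypothetical helper: true when strictness is enabled and the
// skip-optimization bit is clear.
inline bool FlagsMatchSuggestion(std::uint32_t flags)
{
    const bool strict  = (flags & kEnableStrictness) != 0;
    const bool skipOpt = (flags & kSkipOptimization) != 0;
    return strict && !skipOpt;
}
```

With these stand-ins, the second flag combination above (level 3 plus strictness) passes the check, while level 3 alone does not.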

@Niadb
Author

Niadb commented Jul 19, 2019

I'll try and provide some more info in case it helps:

  • I'm calling ReadfirstlaneU on 3 params in the PS; they are passed in from the VS.
  • They are declared as a nointerpolation float4 in the struct.
  • In the PS I call asuint() on .yzw and scalarize the result as a uint3.
  • This data is fed in to the VS as instance data, so it is constant for the draw call.
  • The data is light/shadow offsets and bit masks. The shader loops through each active bit in the mask.
  • The shader also uses AGS barycentrics, dynamic vertex pull, and min3.
  • The uint3 isn't used immediately, as the shader has to do a bunch of computation/texture fetches first.
  • In one version of the shader where ReadFirstLane works correctly, there is some additional code that uses the uint3 right after it is extracted (it also uses it later, as the other variants do, but now it magically works).
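
The data flow in the list above (reinterpret float bits as uint, then walk each set bit of a mask) can be sketched on the CPU as follows. This is not the shader itself: `AsUint` and `ActiveBits` are hypothetical helper names that stand in for HLSL's `asuint()` and the per-bit loop.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// CPU counterpart of HLSL asuint(): copies the float's bit pattern into a
// uint without any numeric conversion.
inline std::uint32_t AsUint(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Hypothetical helper: returns the indices of the set bits in the mask,
// lowest first, i.e. the order a per-light loop over the mask would visit.
inline std::vector<int> ActiveBits(std::uint32_t mask)
{
    std::vector<int> indices;
    for (int bit = 0; mask != 0; ++bit, mask >>= 1) {
        if (mask & 1u) {
            indices.push_back(bit);
        }
    }
    return indices;
}
```

Because the mask comes from per-instance data, every lane in the draw should walk the same bit sequence, which is why a scalarized (readfirstlane) copy of the mask should be safe to loop on.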

If there was a way to get the generated assembly from the driver, instead of having to use awkward tools that may not even produce the same result as the game, that would probably be very useful.
