Compute shader results from other threads #1054

Open
redorav opened this issue Jul 27, 2018 · 6 comments
Labels
Feature (an improvement or feature), Unresolved (waiting for a fix or implementation)

Comments

redorav commented Jul 27, 2018

Description

This might be related to #1038. It seems that compute shader threads that depend on values written by other threads can't be debugged properly. For instance, if I have a compute shader that sets up a groupshared value like so:

groupshared uint gs_variable1 = 0;

[numthreads(8, 1, 1)]
void MyShader(uint groupIndex : SV_GroupIndex)
{
    if(groupIndex == 0)
    {
        gs_variable1 = g_Buffer.Load( // ...
    }
    GroupMemoryBarrierWithGroupSync(); // Wait

    // Other threads do things using that value
}

the value of gs_variable1 is correct for thread (0, 0, 0) but not for the rest of the threads, which don't go through that codepath, meaning the rest of the execution can't be debugged.

Environment

  • RenderDoc build: 1.1
  • Operating System: Windows 7 64-bit
  • API: D3D11
Owner

baldurk commented Jul 27, 2018

Yeah, as I mentioned in #1038, the simulation of compute shaders is done with only a single thread running, so you'll never get the results of any other thread's execution.
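To make the effect concrete, here is a minimal C++ sketch of the shader from the description. It is not RenderDoc's code. The group size of 8 comes from the `numthreads` attribute above; the loaded value `42`, the zero starting value of the groupshared variable, and all function names are assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumed stand-ins for the shader in the description.
constexpr uint32_t kGroupSize = 8;     // from [numthreads(8, 1, 1)]
constexpr uint32_t kLoadedValue = 42;  // hypothetical result of g_Buffer.Load

// The part of the shader that runs before the barrier, for one thread.
void RunBeforeBarrier(uint32_t groupIndex, uint32_t &gsVariable1)
{
  assert(groupIndex < kGroupSize);
  if(groupIndex == 0)
    gsVariable1 = kLoadedValue;
}

// Simulates only the debugged thread. The store made by thread 0 is only
// visible if the debugged thread happens to be thread 0.
uint32_t ValueSeenSingleThread(uint32_t debuggedThread)
{
  uint32_t gsVariable1 = 0;
  RunBeforeBarrier(debuggedThread, gsVariable1);
  // GroupMemoryBarrierWithGroupSync() has nothing to wait for here.
  return gsVariable1;
}

// Simulates every thread in the group up to the barrier, then returns the
// value that any thread would read after it.
uint32_t ValueSeenFullGroup()
{
  uint32_t gsVariable1 = 0;
  for(uint32_t t = 0; t < kGroupSize; t++)
    RunBeforeBarrier(t, gsVariable1);
  return gsVariable1;
}
```

In this sketch `ValueSeenSingleThread(3)` returns the starting value, while `ValueSeenFullGroup()` returns the value stored by thread 0, which matches the behaviour reported in the description.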

The main problem is that simulating the other threads has a huge multiplying effect on the time taken. Even if I multithreaded the execution, that would only be a small improvement.

Potentially it's something I could add behind an opt-in toggle for people who want to pay the cost, but I'm unsure how useful it would be if it takes ages to run. I don't like adding options that let users shoot themselves in the foot.

Author

redorav commented Jul 27, 2018

It would be useful to me, that's for sure. However, I understand it might not be a very commonly requested feature.

As a workaround, is there a way we could edit the value of a register to fake the other thread's execution? I could jump to where I want to debug, set the register that I know is supposed to contain a certain value, and proceed from there. It would be a bit manual and only realistic for certain scenarios, but definitely better than not being able to debug at all.

Owner

baldurk commented Jul 27, 2018

Someone has asked for that before, to be able to test out shader fixes. The difficulty is that, as described in the other bug, the full simulation is run at once rather than in incremental steps, so it's not easy to inject new values.
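The difference between the two approaches can be sketched with a toy interpreter in C++. This is not how RenderDoc is structured internally. The two-operation instruction set, the `ThreadState` layout and the function names are all hypothetical; they only show why an incremental `Step` makes it possible to overwrite a register mid-run, while a run-at-once loop leaves no point at which to do so.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical two-operation instruction set, for illustration only.
enum class Op { LoadShared, AddImm };

struct Instruction
{
  Op op;
  uint32_t dst;  // destination register index
  uint32_t imm;  // immediate operand, used by AddImm only
};

struct ThreadState
{
  uint32_t regs[4] = {0, 0, 0, 0};
  uint32_t shared = 0;  // stands in for a groupshared value nobody wrote
  size_t pc = 0;
};

// Executes one instruction. Returns false once the program has finished.
bool Step(const std::vector<Instruction> &program, ThreadState &state)
{
  if(state.pc >= program.size())
    return false;
  const Instruction &inst = program[state.pc++];
  assert(inst.dst < 4);
  switch(inst.op)
  {
    case Op::LoadShared: state.regs[inst.dst] = state.shared; break;
    case Op::AddImm: state.regs[inst.dst] += inst.imm; break;
  }
  return true;
}

// Run-at-once: the caller only sees the final result.
uint32_t RunAll(const std::vector<Instruction> &program)
{
  ThreadState state;
  while(Step(program, state)) {}
  return state.regs[0];
}

// Incremental: stop before instruction patchPc, overwrite one register with
// the value another thread would have produced, then continue.
uint32_t RunWithPatch(const std::vector<Instruction> &program, size_t patchPc,
                      uint32_t reg, uint32_t value)
{
  assert(reg < 4);
  ThreadState state;
  while(state.pc < patchPc && Step(program, state)) {}
  state.regs[reg] = value;
  while(Step(program, state)) {}
  return state.regs[0];
}
```

With a program that loads the shared value into register 0 and then adds 5, `RunAll` computes from the stale shared value, whereas `RunWithPatch(program, 1, 0, 42)` computes from the injected one.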

Right now the higher priority in this area is supporting debugging at all for new APIs.

Author

redorav commented Jul 27, 2018

Yeah, I agree that should be the focus. Thanks a lot for taking the time to clarify in any case :)

baldurk added the Feature and Unresolved labels on Jul 30, 2018
@bredbored
Contributor

I have an implementation that simulates all the threads in the thread group across all available CPU cores. I made it when I needed to debug a shader that uses groupshared memory.

It can take a while with a large group size, but it's time well spent if you're trying to understand what your compute shader is doing!

It only simulates a single thread group, so if threads in another group write to a location that the simulated group reads, that won't be reflected. Ideally it would detect cases that won't be simulated correctly, but it doesn't at the moment.

It works by blocking at each thread sync instruction until all threads have arrived, and atomic instructions are executed inside critical sections.
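The general technique described above can be sketched in standard C++ as follows. This is not the actual implementation mentioned in the comment. It assumes one OS thread per simulated shader thread, a hand-written reusable barrier standing in for the group sync, and a single mutex standing in for the critical section around atomics; the stored value `42` and all names are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Reusable barrier: each caller blocks until `count` threads have arrived.
class GroupBarrier
{
public:
  explicit GroupBarrier(uint32_t count) : m_Count(count) { assert(count > 0); }

  void Wait()
  {
    std::unique_lock<std::mutex> lock(m_Mutex);
    const uint64_t generation = m_Generation;
    if(++m_Arrived == m_Count)
    {
      // Last thread to arrive releases everyone and resets for the next sync.
      m_Arrived = 0;
      m_Generation++;
      m_Cond.notify_all();
    }
    else
    {
      m_Cond.wait(lock, [&] { return m_Generation != generation; });
    }
  }

private:
  std::mutex m_Mutex;
  std::condition_variable m_Cond;
  uint32_t m_Count;
  uint32_t m_Arrived = 0;
  uint64_t m_Generation = 0;
};

struct GroupShared
{
  uint32_t value = 0;     // written by thread 0 before the sync
  uint32_t counter = 0;   // target of the simulated atomic add
  std::mutex atomicLock;  // critical section guarding simulated atomics
};

void SimulateThread(uint32_t groupIndex, GroupShared &shared, GroupBarrier &barrier,
                    std::vector<uint32_t> &seen)
{
  if(groupIndex == 0)
    shared.value = 42;  // hypothetical value loaded by thread 0

  barrier.Wait();  // stands in for GroupMemoryBarrierWithGroupSync()

  seen[groupIndex] = shared.value;  // each thread writes its own slot only

  {
    // Stands in for an atomic such as InterlockedAdd on groupshared memory.
    std::lock_guard<std::mutex> lock(shared.atomicLock);
    shared.counter += 1;
  }
}

// Simulates one whole thread group and returns what each thread read.
std::vector<uint32_t> SimulateGroup(uint32_t groupSize, uint32_t &counterOut)
{
  GroupShared shared;
  GroupBarrier barrier(groupSize);
  std::vector<uint32_t> seen(groupSize, 0);
  std::vector<std::thread> threads;
  for(uint32_t i = 0; i < groupSize; i++)
    threads.emplace_back(SimulateThread, i, std::ref(shared), std::ref(barrier),
                         std::ref(seen));
  for(std::thread &t : threads)
    t.join();
  counterOut = shared.counter;
  return seen;
}
```

Calling `SimulateGroup(8, counter)` gives every simulated thread the value stored by thread 0. Blocking inside the barrier is why this sketch needs a dedicated OS thread per simulated thread, which is one reason large group sizes can be slow.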

It uses deferred contexts, which turned out to be a bit of a nuisance because I have to turn off the single-threaded device bit.

It still puts up the cancel dialog if it runs for a certain number of instructions. Ideally I think it would have a cancel button next to the "thinking" indicator that you could press at any time, but I haven't done that.

KondeU commented Sep 19, 2022

Thanks for @bredbored's implementation; it was very helpful for debugging my compute shader :-) Are there any official plans to support it?
