Multireduce Kernels: Allow IF blocks to terminate early #4457
This diff suggests:
Is there a way to isolate this behavior in |
i can experiment with it, do you want me to move to draft in the meantime?
so, since IF statements aren't dependencies the way loops are, we will have to change something in the linearizer. but we will have to change the linearizer anyways to allow the results of one reduceop to be loaded back into every thread for a potential next reduceop. general pattern is:
I've drafted both of these systems locally b/c they aren't that complex:
UOps always form a graph. It's in the design of UOps.END* to be childless/graph breaking.
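To make the "childless" point concrete, here is a toy sketch of a childless-removal pass. It is not tinygrad's code: `Node`, `SIDE_EFFECTS`, and this `remove_childless` are hypothetical stand-ins, and the rule that BARRIER and STORE are always kept is an assumption made for the example. It only shows that an ENDIF nothing references gets pruned, while one listed in another op's `vin` survives.

```python
from dataclasses import dataclass

# Toy stand-in for a UOp (hypothetical, not tinygrad's class).
@dataclass(eq=False)
class Node:
    op: str
    vin: tuple = ()

# Assumption for this sketch: these ops are always kept as roots.
SIDE_EFFECTS = {"STORE", "BARRIER"}

def remove_childless(nodes):
    """Keep only nodes reachable through vin from a side-effecting node."""
    keep = set()
    stack = [n for n in nodes if n.op in SIDE_EFFECTS]
    while stack:
        n = stack.pop()
        if id(n) in keep:
            continue
        keep.add(id(n))
        stack.extend(n.vin)
    # Preserve the original ordering of the surviving nodes.
    return [n for n in nodes if id(n) in keep]

cond = Node("CONST")
if_ = Node("IF", (cond,))
endif = Node("ENDIF", (if_,))
store = Node("STORE", (cond,))

# Nothing references the ENDIF, so it (and its IF) are dropped.
pruned_ops = [n.op for n in remove_childless([cond, if_, endif, store])]

# Referencing the ENDIF from a BARRIER's vin keeps the whole IF block alive.
barrier = Node("BARRIER", (endif,))
kept_ops = [n.op for n in remove_childless([cond, if_, endif, barrier, store])]

print(pruned_ops)
print(kept_ops)
```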
This branch currently is behind tinygrad/master. The line count difference bot is disabled.
nice, is this ready to test end-to-end? Can you merge it with #4259?
I can, I like this more: the seems a bit more robust? |
https://github.com/tinygrad/tinygrad/pull/4259/files#r1594119777 regardless it will need this chunk of code to test end-to-end. Changes have been made (like getting rid of the explicit
@0xtimmy Can you make it work with the BARRIER one 18641c9?
There are two merge blockers for this:
this pr modifies UOpGraph.add_ends() so that it doesn't double add ENDIFs inserted by the linearizer
this will let the linearizer terminate if blocks early when it has to render another reduceop
ex. in standard deviation
the linearizer will also have to put the ENDIF in a vin so it doesn't get optimized away by UOpGraph.remove_childless(). I think putting it in the vin of the barrier makes the most sense
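A minimal sketch of the "don't double add" idea, assuming a toy op list rather than the real UOpGraph: `Node` and this `add_ends` are hypothetical, and the actual method in tinygrad may be structured differently. The pass appends an ENDIF only for IFs that the linearizer has not already closed.

```python
from dataclasses import dataclass

# Toy stand-in for a UOp (hypothetical, not tinygrad's class).
@dataclass(eq=False)
class Node:
    op: str
    vin: tuple = ()

def add_ends(uops):
    """Append an ENDIF for each IF that has no ENDIF yet; leave early ENDIFs alone."""
    # IFs that were already terminated early (by identity of the IF node).
    closed = {id(u.vin[0]) for u in uops if u.op == "ENDIF"}
    out = list(uops)
    # Close the most recently opened IF first.
    for u in reversed(uops):
        if u.op == "IF" and id(u) not in closed:
            out.append(Node("ENDIF", (u,)))
    return out

# Rough shape of a two-reduce kernel such as standard deviation:
# the first IF block is ended early so a second reduce can follow the barrier.
if1 = Node("IF")
endif1 = Node("ENDIF", (if1,))      # inserted early by the linearizer
barrier = Node("BARRIER", (endif1,))  # ENDIF placed in the barrier's vin
if2 = Node("IF")

result = add_ends([if1, endif1, barrier, if2])
result_ops = [u.op for u in result]
print(result_ops)
```

Running the pass a second time on `result` adds nothing, since every IF already has its ENDIF.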