
Add a fuzzer for sliding window schedules #8144

Open · wants to merge 7 commits into main

Conversation

abadams (Member) commented Mar 7, 2024

Works around issue #8140

It would test hoist storage too, but that's disabled due to issue #8141

abadams (Member, Author) commented Mar 8, 2024

Interestingly, this broke a few tests because they now do fewer evaluations than expected, so it seems the old sliding window was better in those cases. However, it also broke camera_pipe because it can't fold some storage as aggressively, so there must also be cases where it's worse (or maybe it just makes the bounds expressions more complex, so storage folding gets more conservative).

dsharlet (Contributor) commented Mar 8, 2024

I think both of those changes are expected, and are not necessarily better.

The old way of doing sliding window will always do the entire warmup in one production, while the new way does it one line at a time, just like the steady state. For an N-point stencil, this is N times as many calls, but I wouldn't consider it an improvement.

It's expected that the old way will fail to fold storage as tightly, which is why we need the new way for devices with a constrained amount of memory for intermediate buffers. Consider two 3-point stencils in a row: the old method will want to do the warmup entirely on the first iteration, which requires computing 5 elements at once, so we can't fold into fewer than 5 elements. However, the new method is able to warm up one line at a time, as required by each line of the second stencil, so we can fold into 3 elements. The point is, it's not because the bounds expressions are worse or some other unrelated thing that could be fixed; it's an inherent property of the two warmup strategies.
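
For illustration, here is a minimal Halide sketch of the pipeline shape being described (the function names and schedule are mine, not from this PR): two chained 3-point stencils whose producers slide along the consumer's loop over y. Whether f's storage can fold down to 3 rows or only to 5 depends entirely on whether g's warmup happens in a single production or one row at a time.

```cpp
#include "Halide.h"
using namespace Halide;

int main() {
    Var x("x"), y("y");
    Func f("f"), g("g"), h("h");

    f(x, y) = x + y;
    g(x, y) = f(x, y - 1) + f(x, y) + f(x, y + 1);  // first 3-point stencil
    h(x, y) = g(x, y - 1) + g(x, y) + g(x, y + 1);  // second 3-point stencil

    // Slide f and g along h's loop over y. Producing the first row of h
    // needs 3 rows of g, which in turn need 5 rows of f. If g's warmup
    // computes those 5 rows of f in one production, f cannot fold below
    // 5 rows; if the warmup proceeds one row of g at a time, only 3 rows
    // of f are ever live at once and f can fold accordingly.
    f.store_root().compute_at(h, y);
    g.store_root().compute_at(h, y);

    h.realize({64, 64});
    return 0;
}
```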

abadams (Member, Author) commented Mar 8, 2024

Oh no, the sliding_reduction test is now doing too few evaluations and producing the wrong output.

Sliding window was broken before, too 😭

Edit: Actually it's doing the right number of evaluations of the stage with the call counter in it, so that number should indeed be 42, not 48. However, it's running the prior update stage (which doesn't have a call counter attached) too few times.

abadams (Member, Author) commented Mar 8, 2024

I understand your storage folding point, but not the other one. Why is it N times as many evaluations of the producer stage? I'm not counting loop iterations here. Shouldn't it compute each required value of the producer just once in either case?

Oh, you're saying it's N times as many productions. I was counting stores, not productions.

abadams (Member, Author) commented Mar 8, 2024

I have confirmed that this is broken even when reverting all of SlidingWindow.cpp back to before the change that added loop rewinding. FML

abadams (Member, Author) commented Mar 8, 2024

Oh, I think this is actually an instance of #7819. A row of one of the first update stages is being skipped, because it's not used by the output (according to the trimmed .min and .max), but an earlier row was clobbered with garbage by a ShiftInwards on the pure definition.

Edit, because I was asked about it: this comment only refers to the bug in sliding_reduction. The fuzzer doesn't do any overcompute that would trigger #7819.

Overcompute on sliding window stages is problematic if garbage could be
produced in the overcomputed region.
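
To make the failure mode concrete, here is a hypothetical sketch (mine, not code from this PR or from the sliding_reduction test, and it may or may not actually reproduce the miscompile) of the ingredients described above: a sliding producer whose pure step writes placeholder values that only the update step corrects, combined with a ShiftInwards split on the pure step that re-stores earlier rows.

```cpp
#include "Halide.h"
using namespace Halide;

int main() {
    Var x("x"), y("y"), yi("yi");
    Func f("f"), g("g");

    // Pure step writes placeholder values; the update step writes the
    // values the consumer actually depends on.
    f(x, y) = 0;
    f(x, y) += x + y;

    // A 2-point stencil in y, so f can slide along g's loop over y.
    g(x, y) = f(x, y) + f(x, y + 1);

    // Slide f along g's loop, and split f's pure step with ShiftInwards.
    // Once sliding has trimmed the pure step to one new row per iteration
    // of g, the shifted tail reaches back and re-stores earlier rows of f
    // with pure-step values. If the update step's trimmed bounds then skip
    // those rows, g can observe the clobbered values.
    f.store_root().compute_at(g, y);
    f.split(y, y, yi, 2, TailStrategy::ShiftInwards);

    g.realize({8, 20});
    return 0;
}
```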