Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition in OpenCL kernel #3535

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

frasercrmck
Copy link

Description

Without the barrier at the end of barrierOR, it is possible for work-item 0 to start the next loop iteration and update predicates[0] while other work-items are still inside barrierOR reading predicates, meaning they read the next loop iteration's exit condition. This results in a divergent loop, where not all work-items reach the same barriers.

A previous fix identified this as a problem only on NVIDIA platforms, but strictly speaking a barrier is required in all cases to avoid a spec violation and undefined behaviour.

Changes to Users

The kernel should produce correct results on more OpenCL implementations.

Locally I tested both Intel(R) FPGA Emulation Device and various oneAPI Construction Kit devices, which all previously failed the confidence_connected_opencl --gtest_filter="SingleSeed/ConfidenceConnectedDataTest.SegmentARegion/_prefix_background_radius_0_multiplier_1_iterations_5_replace_255" unit test.

I'm unable to test other OpenCL implementations, sorry.

Checklist

  • Rebased on latest master
  • Code compiles
  • Tests pass
  • [ ] Functions added to unified API
  • [ ] Functions documented

Without the barrier at the end of barrierOR, it is possible for
work-item 0 to start the next loop iteration and update predicates[0]
while other work-items are still inside barrierOR reading `predicates`,
meaning they read the next loop iteration's exit condition. This results
in a divergent loop, where not all work-items reach the same barriers.

A previous fix identified this as a problem only on NVIDIA platforms,
but strictly speaking a barrier is required in all cases to avoid a spec
violation and undefined behaviour.
@umar456
Copy link
Member

umar456 commented Feb 21, 2024

Took me a bit to figure out the problem but I see the issue now. The we can ignore the errors in the CI because they are not related. I will test it on a couple of other systems before merge this PR. Thank you for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants