Swap ordering of thread configuration in Sycl #1628
@rhornung67, did you want to take a look at this PR, or should I just merge?
Please add some notes about this in the user guide. Also, did you notice any performance differences with this change?
@rhornung67 I just pushed a note to the docs. Please let me know if any adjustments are needed. I held off on commenting about kernel, as we need to double-check it first and can follow up afterward. I also don't have a sense of the performance impact: I had not looked at timings, and the kernels in the RAJA Perf Suite were generating incorrect answers prior to this change.
Minor doc changes.
Co-authored-by: Rich Hornung <hornung1@llnl.gov>
@artv3 if you don't want to keep the branch, please delete it.
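SYCL treats the last (right-most) dimension of a range as the fastest varying, whereas CUDA and HIP treat x as the fastest. This change swaps the ordering of the thread configuration in the SYCL back end so that RAJA::launch behaves consistently with the other device back ends.
For reference, this page has a good description of thread indexing in SYCL: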
https://www.intel.com/content/www/us/en/docs/dpcpp-compatibility-tool/developer-guide-reference/2023-2/cuda-and-sycl-programming-model-comparison.html
See the Thread Indexing section.
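As a rough illustration of why the ordering matters, here is a minimal plain C++ sketch that models the two linearization conventions. It is not the code from this PR and uses no RAJA or SYCL API; `Dim3`, `Range3`, `cuda_linear_id`, `sycl_linear_id`, and `to_sycl` are hypothetical names made up for this example. It assumes the usual conventions: CUDA/HIP linearize with x fastest, and SYCL linearizes a 3D range with dimension 2 fastest.

```cpp
#include <cstddef>

// Hypothetical helper types for illustration only (not RAJA or SYCL types).
struct Dim3   { std::size_t x, y, z; };     // CUDA/HIP-style triple
struct Range3 { std::size_t d0, d1, d2; };  // SYCL-style triple (dims 0, 1, 2)

// CUDA/HIP convention: x is the fastest-varying index.
constexpr std::size_t cuda_linear_id(Dim3 idx, Dim3 dim) {
  return idx.x + dim.x * (idx.y + dim.y * idx.z);
}

// SYCL convention: dimension 2 (the right-most) is the fastest-varying index.
constexpr std::size_t sycl_linear_id(Range3 id, Range3 range) {
  return (id.d0 * range.d1 + id.d1) * range.d2 + id.d2;
}

// Swapped mapping: x -> dim 2, y -> dim 1, z -> dim 0.
constexpr Range3 to_sycl(Dim3 v) {
  return Range3{v.z, v.y, v.x};
}

// With the swap, the thread with x == 1 sits next to thread 0 in both models.
static_assert(cuda_linear_id(Dim3{1, 0, 0}, Dim3{4, 2, 3}) ==
              sycl_linear_id(to_sycl(Dim3{1, 0, 0}), to_sycl(Dim3{4, 2, 3})),
              "swapped mapping keeps x fastest");

// Without the swap, the same thread lands at a different linear position.
static_assert(sycl_linear_id(Range3{1, 0, 0}, Range3{4, 2, 3}) !=
              cuda_linear_id(Dim3{1, 0, 0}, Dim3{4, 2, 3}),
              "unswapped mapping changes the fastest index");
```

Under these assumptions, passing (x, y, z) straight through as SYCL dimensions (0, 1, 2) makes x the slowest index, which is the kind of inconsistency with the other back ends that the swap is meant to avoid.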