I compiled with clang++ and -fopenmp, yet I am having trouble controlling the number of threads. triSYCL seems to launch two or even three times as many threads as there are CPU cores, which I suspect is the main reason for the poor performance of my code. Changing OMP_NUM_THREADS does not seem to have any effect. Any suggestions?
If you are not using hierarchical parallelism, SYCL assumes by default that there might be barriers in the work-group. To support this with OpenMP and a plain C++ compiler, we need 1 thread per work-item... :-(
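To illustrate why a barrier forces this, here is a minimal plain C++ sketch. It is not triSYCL code, and the kernel and the function names `run_items_sequentially` and `run_phases_as_loops` are made up for the example. Each work-item writes its own slot, then reads its neighbour's slot after a barrier. A library-only implementation cannot split the kernel at the barrier, so running the work-items one after another on the same thread gives stale values, while code that is explicitly split into phases (roughly what hierarchical parallelism lets you write) can run as ordinary loops:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical kernel body for work-item i in a work-group of size n:
//   local[i] = i;  barrier();  out[i] = local[(i + 1) % n];

// Each work-item runs to completion before the next one starts, as would
// happen if several work-items shared one thread. The barrier cannot be
// honoured: the neighbour has usually not written its slot yet.
std::vector<int> run_items_sequentially(std::size_t n) {
  std::vector<int> local(n, -1), out(n);
  for (std::size_t i = 0; i < n; ++i) {
    local[i] = static_cast<int>(i);  // phase before the barrier
    out[i] = local[(i + 1) % n];     // phase after the barrier, too early
  }
  return out;
}

// The same kernel split by hand into two loops. The end of the first loop
// plays the role of the barrier, so no per-work-item thread is needed.
std::vector<int> run_phases_as_loops(std::size_t n) {
  std::vector<int> local(n, -1), out(n);
  for (std::size_t i = 0; i < n; ++i) {
    local[i] = static_cast<int>(i);
  }
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = local[(i + 1) % n];
  }
  return out;
}
```

With n = 4, the first function returns {-1, -1, -1, 0}, which is wrong for every work-item except the last, and the second returns the expected {1, 2, 3, 0}.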
If you know that you will not be in that case, you can play with