rewrite of group algorithms #966
base: develop
This implements has_known_identity and known_identity as well as tests. It also adds plus<void>, multiplies<void>, etc.
This mainly makes vec constexpr, adjusts known_identity so that it can initialize vec, and adjusts the tests so that they can check the vec values.
Also added scoped parallelism V2 support.
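To illustrate what the has_known_identity and known_identity traits from these commits provide, here is a minimal sketch in plain C++. It is not the code from this PR: the namespace `identity_sketch` is hypothetical, only plus and multiplies are covered, and the typed and transparent (`<void>`) functors are specialized separately purely for illustration.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Hypothetical stand-in for illustration; not the hipSYCL implementation.
namespace identity_sketch {

// Primary templates: by default, no identity is known for an operation.
template <class BinaryOp, class T>
struct has_known_identity : std::false_type {};

template <class BinaryOp, class T>
struct known_identity;  // intentionally left undefined

// Addition: identity 0, for both plus<T> and the transparent plus<void>.
template <class T>
struct has_known_identity<std::plus<T>, T>
    : std::integral_constant<bool, std::is_arithmetic<T>::value> {};
template <class T>
struct has_known_identity<std::plus<void>, T>
    : std::integral_constant<bool, std::is_arithmetic<T>::value> {};
template <class T>
struct known_identity<std::plus<T>, T> {
  static constexpr T value = static_cast<T>(0);
};
template <class T>
struct known_identity<std::plus<void>, T> {
  static constexpr T value = static_cast<T>(0);
};

// Multiplication: identity 1.
template <class T>
struct has_known_identity<std::multiplies<T>, T>
    : std::integral_constant<bool, std::is_arithmetic<T>::value> {};
template <class T>
struct has_known_identity<std::multiplies<void>, T>
    : std::integral_constant<bool, std::is_arithmetic<T>::value> {};
template <class T>
struct known_identity<std::multiplies<T>, T> {
  static constexpr T value = static_cast<T>(1);
};
template <class T>
struct known_identity<std::multiplies<void>, T> {
  static constexpr T value = static_cast<T>(1);
};

}  // namespace identity_sketch
```

With such traits, an algorithm can query at compile time whether an operation has an identity for a given type and, if so, use it as a neutral starting value.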
HIPSYCL_BUILTIN
T exclusive_scan_over_group(Group g, T x, BinaryOperation binary_op) {
  HIPSYCL_RETURN_DISPATCH_GROUP_ALGORITHM(__hipsycl_exclusive_scan_over_group,
                                          g, x, binary_op);
Where did exclusive_scan_over_group without init go? The spec says we need to have this overload. I don't see it anymore, am I missing something?
I put all backend-independent implementations at line 253 and below. Those are, for example, the functions without init, which can be implemented using the identity and the version with init, or functions where an index just needs to be linearized. The exclusive scan without init, for example, is at line 378.
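The delegation pattern described here can be sketched in plain host C++. This is an assumption-laden illustration, not the code at the lines mentioned: the namespace `scan_sketch`, the reduced `known_identity` trait, and the use of a `std::vector` as a sequential stand-in for a work group are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical host-side model; a std::vector stands in for the values
// held by the work items of one group.
namespace scan_sketch {

// Reduced identity trait, only what this example needs (assumption).
template <class BinaryOp, class T>
struct known_identity;
template <class T>
struct known_identity<std::plus<T>, T> {
  static constexpr T value = static_cast<T>(0);
};

// "Backend" variant: exclusive scan with an explicit init value.
// Element i receives init combined with all elements before i.
template <class T, class BinaryOp>
std::vector<T> exclusive_scan_over_group(const std::vector<T>& items, T init,
                                         BinaryOp op) {
  std::vector<T> out(items.size());
  T running = init;
  for (std::size_t i = 0; i < items.size(); ++i) {
    out[i] = running;
    running = op(running, items[i]);
  }
  return out;
}

// Backend-independent variant: no init argument. It forwards to the
// variant above, using the operation's known identity as init.
template <class T, class BinaryOp>
std::vector<T> exclusive_scan_over_group(const std::vector<T>& items,
                                         BinaryOp op) {
  return exclusive_scan_over_group(items, known_identity<BinaryOp, T>::value,
                                   op);
}

// Row-major linearization of a 2D index, the other kind of
// backend-independent helper mentioned above.
inline std::size_t linearize(std::size_t id0, std::size_t id1,
                             std::size_t range1) {
  return id0 * range1 + id1;
}

}  // namespace scan_sketch
```

Only the variant with init needs a backend-specific implementation in this model; the overload without init adds no scan logic of its own.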
The group algorithms were restructured and partially rewritten to make use of known_identity.
Where possible, functions are implemented in terms of other group function variants to reduce code duplication.
This PR also adds the functions for scoped parallelism V2, which are built on the ND-range variants.
The tests were also rewritten and now only check functions that aren't implemented using others.