New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix gh-1279, implement tensor.allclose #1343
Conversation
@ndgrigorian This PR superseded gh-1340, as it also ensures that |
1. Removed unused usm_ndarray._clone static C-only method 2. Removed _dispatch* utilities 3. Used direct calls to unary/binary operators in implementation of special methods
Provide cabs private method implementating abs for complex types, paying attention to array-API mandated special values. To work-around gh-1279, use std::hypot to compute value for finite inputs. Compile with -DUSE_STD_ABS_FOR_COMPLEX_TYPES to use std::abs(z) instead of std::hypot(std::real(z), std::imag(z)).
This change provides private method csqrt to evaluate square-root for complex types. It handles special values as mandated by array API. The finite input, it provides its own implementation based on std::hypot and std::sqrt for real types instead of calling std::sqrt on finite input of complex type. Compile with -DUSE_STD_SQRT_FOR_COMPLEX_TYPES to use std::sqrt instead of custom implementation. Cursory performance study suggests that custom implementation is at least not worse than std::sqrt one.
This utility function is based on symmetric check, unlike numpy.allclose, and verifies that abs(x1-x2) < atol + rtol * max(abs(x1), abs(x2)) This way allclose(x1, x2) is symmetric, and allclose(x1,x2) implies allclose(x2, x1).
3f971ff
to
26862b4
Compare
View rendered docs @ https://intelpython.github.io/dpctl/pulls/1343/index.html |
Array API standard conformance tests for dpctl=0.14.6dev2=py310h7bf5fec_9 ran successfully. |
Array API standard conformance tests for dpctl=0.14.6dev2=py310h7bf5fec_8 ran successfully. |
Array API standard conformance tests for dpctl=0.14.6dev2=py310h7bf5fec_9 ran successfully. |
Array API standard conformance tests for dpctl=0.14.6dev3=py310h7bf5fec_8 ran successfully. |
Verified that after the changes dpctl is able to execute problematic inputs on machine with Iris Xe running Windows:
|
1. Aligned default values with those of np.allclose 2. Replaced less test with less_equal to align with NumPy.
Also added tests for early exits to improve coverage.
There is a compelling case for another change here: so the implementation would become
We aren't nearly as burdened by backwards compatibility, so it's interesting to consider, as the incorrect cases pointed out in this issue are incorrect here, too.
Downside would be another call to |
It won't be a downside, we would just trade a call to |
@ndgrigorian I have pushed the change to use |
@oleksandr-pavlyk Does it make sense to change other calls to And if so, maybe it should be in |
I do not think so, but even if we do, we should do it in a separate PR. |
Array API standard conformance tests for dpctl=0.14.6dev3=py310h7bf5fec_16 ran successfully. |
Array API standard conformance tests for dpctl=0.14.6dev3=py310h7bf5fec_17 ran successfully. |
This changes builds up on gh-1265 and takes into account queue from the pre-allocated buffer, if provided.
@oleksandr-pavlyk
|
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_23 ran successfully. |
This improves accuracy at extremes of supported range. Use sycl:: namespace ldexp and ilogb to prevent problem with VS 2017 headers.
@ndgrigorian I pushed the change to normalize the scale of the arguments in Perhaps the solution is to use sych functions from |
59a7d11
to
c4312cb
Compare
ilogb would have to pay attention to correctly computing scale of denormal floats, while simpler code suffices. Also use unscaled version in most cases, and scaled version only for very large inputs.
We work around issues with these functions when their implementation is taken from VS 2017 headers on Windows though.
Since you removed other superfluous |
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_32 ran successfully. |
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_34 ran successfully. |
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_33 ran successfully. |
Reconfirmed that the latest build resolves gh-1279 on Windows machine with Iris Xe integrated GPU:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM, thank you @oleksandr-pavlyk
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞 |
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_34 ran successfully. |
Fixes gh-1279.
dpctl.tensor.abs
anddpctl.tensor.sqrt
for complex types. This closes dpctl.tensor.abs() does not work on Iris Xe on Windows #1279.dpctl.tensor.allclose
.