
Dot verification fails with single precision #20

Open
jrprice opened this issue Dec 13, 2016 · 10 comments

jrprice commented Dec 13, 2016

We probably just need to increase the tolerance. The error will also be proportional to the size of the arrays (unlike with the other kernels), so we need to make sure that whatever error-checking tolerance we use is robust enough to avoid this kind of false positive for any input size.

Validation failed on sum. Error 0.000209808
Sum was 39.7910385131836 but should be 39.7912483215332
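
For reference, the drift is easy to reproduce with a standalone sketch (illustrative only, not code from the repo): a serial single-precision reduction of the element-wise products moves further from the exact value as the array grows, because every addition can introduce rounding error.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

int main()
{
  for (std::size_t n : {1000u, 100000u, 10000000u})
  {
    // Same idea as the dot kernel: constant input arrays, serial reduction.
    std::vector<float> a(n, 0.1f), b(n, 0.2f);

    float sum = 0.0f;
    for (std::size_t i = 0; i < n; i++)
      sum += a[i] * b[i];

    // "Gold" value computed with one multiplication instead of n additions.
    const double gold = static_cast<double>(0.1f) * static_cast<double>(0.2f)
                      * static_cast<double>(n);

    std::printf("n=%10zu  sum=%.9f  gold=%.9f  error=%g\n",
                n, static_cast<double>(sum), gold, std::fabs(sum - gold));
  }
  return 0;
}
```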
@tomdeakin (Contributor)

We currently check that the sum is within 1.0E-8 of the expected value for both doubles and floats. We could either:

  1. Use 1.0E-5 for floats and 1.0E-8 for doubles
  2. Factor in the array size somehow

Option 1 is simple but might hide errors. If the arrays contain correct values, then as long as the reduction result is close, that is probably good enough for this benchmark.
With option 2, it is hard to quantify how the tolerance should scale with the array size.
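
As a very rough sketch of what the two options could look like (the helper names are made up for illustration; this is not the actual verification code):

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <type_traits>

// Option 1: fixed, type-dependent tolerance.
template <typename T>
bool dot_ok_fixed(T sum, T gold_sum)
{
  const double tol = std::is_same<T, float>::value ? 1.0e-5 : 1.0e-8;
  const double err = std::fabs(static_cast<double>(sum) - static_cast<double>(gold_sum)) /
                     std::fabs(static_cast<double>(gold_sum));
  return err < tol;
}

// Option 2: let the tolerance grow with the array size, since the reduction
// performs O(n) additions and each one may round.
template <typename T>
bool dot_ok_scaled(T sum, T gold_sum, std::size_t array_size)
{
  const double eps = static_cast<double>(std::numeric_limits<T>::epsilon());
  const double tol = eps * static_cast<double>(array_size);
  const double err = std::fabs(static_cast<double>(sum) - static_cast<double>(gold_sum)) /
                     std::fabs(static_cast<double>(gold_sum));
  return err < tol;
}
```

For dot, a relative error is probably the right thing to compare either way, since the absolute error scales with the magnitude of the result as well as with the array size.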

@Srinivasuluch

Would it be possible to make "sum" have double datatype irrespective of the input type ("double or float"), so that it gives accurate results?
"sum" is only used for comparison with the "goldSum" value, so its type should not matter.
I mean, change the templated sum variable to "double sum".
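
If I understand the suggestion correctly, it amounts to something like this sketch (the function name is made up; a real change would live in each device kernel rather than in host code like this):

```cpp
#include <cstddef>
#include <vector>

// Accumulate the dot product in double even when the array type T is float,
// so the value compared against goldSum carries less rounding error.
template <typename T>
double dot_with_double_sum(const std::vector<T>& a, const std::vector<T>& b)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); i++)
    sum += static_cast<double>(a[i]) * static_cast<double>(b[i]);
  return sum;
}
```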

@tomdeakin (Contributor)

For devices which do not support double precision, would this not pose a problem?

@zyzzyxdonta

Hi,
is this issue still being worked on?

@tomdeakin (Contributor)

Yes, but we've not come up with a satisfactory solution yet.

@zyzzyxdonta

Thanks for your reply. Am I right in assuming that despite the verification failing, my measurements are still valid?

@tomdeakin (Contributor)

If it's just the reduction (dot) that fails and the other kernels are OK, then the contents of the arrays should be correct. If the result is close on inspection but fails only because of the tolerance, then it's probably fine. If the result is 0.0 or some other nonsense number, then something might have gone really wrong...

@zyzzyxdonta

Alright, thanks a lot!

@tomdeakin (Contributor)

@zjin-lcf suggested using different tolerances for the reduction result based on the data type (option 1 above).

@tomdeakin (Contributor)

Whilst reviewing #186 we discussed the fact that the goldSum value is computed exactly, using a single multiplication rather than repeated addition. I wonder if there is a way to derive error bounds for the difference between the two algorithms from the floating-point rounding rules.
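
One possible starting point (just a sketch, not a worked-out proposal): the classical forward error bound for a recursively summed dot product of length n is roughly n·u·Σ|a_i·b_i|, where u is the unit roundoff of the working precision. Since the benchmark fills the arrays with known constant values, that sum of absolute terms is known up front, so the bound could be turned into a tolerance along these lines (the helper name is hypothetical):

```cpp
#include <cstddef>
#include <limits>

// Conservative bound on the rounding error of a serial (recursive) dot
// product reduction of n terms: about n * u * sum(|a_i * b_i|), where
// u = epsilon/2 is the unit roundoff of T. Tree-shaped reductions typically
// do better (a factor closer to log2(n)), so this should not reject a
// correct device result.
template <typename T>
double dot_reduction_error_bound(std::size_t n, double abs_term_sum)
{
  const double u = static_cast<double>(std::numeric_limits<T>::epsilon()) / 2.0;
  return static_cast<double>(n) * u * abs_term_sum;
}
```

One caveat: for single precision and very large arrays, n·u approaches 1 and the first-order bound stops being meaningful, so a log2(n)-based bound or a double accumulator (as suggested above) would be tighter there.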
