variance and covariance inconsistency #17086
mattip added the "33 - Question" (Question about NumPy usage or development) and "50 - Duplicate" labels on Aug 15, 2020
Thanks, Matti.
From: Matti Picus <notifications@github.com>
Sent: Saturday, August 15, 2020 1:53 PM
To: numpy/numpy <numpy@noreply.github.com>
Cc: WANG, Tan <tanwang@saif.sjtu.edu.cn>; Author <author@noreply.github.com>
Subject: Re: [numpy/numpy] variance and covariance inconsistency (#17086)
Duplicate of #5835, and documented in the note to var (https://numpy.org/devdocs/reference/generated/numpy.var.html), about 2/3 down the page.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub<#17086 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AQTW5B3BS23FSJIKVXKIBWTSA3YUDANCNFSM4P7YSYCA>.
Dear Matti,
Could I bother you with another problem I encountered? I reported this to docs@python.org but got this bounce-back:
python.org suspects your message is spam and rejected it
Here is the problem:
import math
math.ceil(4*0.20*1000)
produces 800, but
math.ceil(3*0.20*1000)
produces 601.
This is obviously incorrect.
Thanks.
Tan
By the way,
np.ceil(3*0.20*1000)
gives 601.0
Floating point numbers are tricky. See this discussion.
Thanks!
From: Matti Picus <notifications@github.com>
Sent: Saturday, August 15, 2020 2:47 PM
To: numpy/numpy <numpy@noreply.github.com>
Cc: WANG, Tan <tanwang@saif.sjtu.edu.cn>; Author <author@noreply.github.com>
Subject: Re: [numpy/numpy] variance and covariance inconsistency (#17086)
Floating point numbers are tricky. See this discussion<https://softwareengineering.stackexchange.com/questions/101163/what-causes-floating-point-rounding-errors>.
>>> (3 * 0.2) * 1000.0
600.0000000000001
>>> 3 * (0.2 * 1000.0)
600.0
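To make the rounding issue above concrete, here is a minimal sketch of why math.ceil(3*0.20*1000) returns 601 and two possible ways to get 600. The 9-decimal rounding tolerance is an assumption chosen for illustration, not a recommendation from this thread; pick a tolerance that suits your data.

```python
import math
from decimal import Decimal
from fractions import Fraction

# The literal 0.20 is stored as the nearest binary double, which is
# slightly larger than 1/5. Decimal(float) shows the exact stored value.
stored = Decimal(0.20)
print(stored)

raw = 3 * 0.20 * 1000          # slightly above 600 because of rounding
naive = math.ceil(raw)         # ceil pushes the tiny excess up to 601

# Option 1: round away the floating-point noise before taking the ceiling.
# The 9 decimal places here are an assumed tolerance for this example.
rounded = math.ceil(round(raw, 9))

# Option 2: do the arithmetic exactly with rational numbers.
exact = math.ceil(Fraction("0.20") * 3 * 1000)

print(naive, rounded, exact)
```

Rounding first is cheap but needs a tolerance that fits the problem; exact arithmetic with Fraction or Decimal avoids the tolerance at the cost of speed.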
closing as a duplicate
This may be a trivial problem, but there seems to be an inconsistency between variance and covariance:
import numpy as np
x = [1,2,3,4,5,6,7]
np.var(x)
produces 4.0, but then
np.cov(x,x)
produces
array([[4.66666667, 4.66666667],
[4.66666667, 4.66666667]])
Of course, mathematically var(x) = cov(x,x). The difference arises because, by default, np.var divides the sum of squared deviations by n while np.cov divides it by n - 1. For a naive user who does not pay attention to this detail, the inconsistency may cause problems.
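The two functions can be made to agree by passing the same ddof ("delta degrees of freedom") to both, since the divisor is n - ddof. The following is a minimal sketch using the data from this report; the variable names are chosen here for illustration.

```python
import numpy as np

x = [1, 2, 3, 4, 5, 6, 7]

# np.var defaults to ddof=0, so it divides by n (population variance).
pop_var = np.var(x)

# With ddof=1 it divides by n - 1 (sample variance), matching np.cov's default.
sample_var = np.var(x, ddof=1)

# np.cov divides by n - 1 unless told otherwise.
cov_default = np.cov(x, x)

# Passing ddof=0 makes np.cov divide by n, matching np.var's default.
cov_pop = np.cov(x, x, ddof=0)

print(pop_var, sample_var)
print(cov_default)
print(cov_pop)
```

Whichever convention you choose, stating ddof explicitly in both calls keeps the results comparable.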