Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cov not behaving properly for vector of vectors? #114

Open
dannys4 opened this issue Jun 13, 2022 · 0 comments
Open

cov not behaving properly for vector of vectors? #114

dannys4 opened this issue Jun 13, 2022 · 0 comments

Comments

@dannys4
Copy link

dannys4 commented Jun 13, 2022

I'm not sure if I'm understanding the behavior of cov correctly, but it seems like the current implementation for between two vectors only works on a vector of numbers. It currently gives the dot product between the vectors, which is only equivalent to what the documentation says when the vectors only contain scalars (since it's doing an inner product). At some point before bringing it over from StatsBase.jl, this probably worked according to this comment, but right now it doesn't give the behavior I expect. For example,

julia> N = 1000;

julia> A = [randn(5) for _ in 1:N];

julia> B = [randn(5) for _ in 1:N];

julia> using Statistics

julia> cov(A,B) # This is unexpected
0.15626480407767923

julia> sum((a-mean(A))*(b-mean(B))' for (a,b) in zip(A,B))/(N-1) # This is expected
5×5 Matrix{Float64}:
  9.35299e-16  -5.73004e-16  -4.02748e-16   9.26408e-16  -3.28066e-16
 -7.09476e-16   1.65189e-15   2.7001e-15   -9.89532e-16  -2.72944e-16
  2.08931e-16   3.56516e-16   7.28591e-16   9.61526e-16  -1.33894e-15
  3.51182e-16  -1.68034e-16  -1.23136e-16  -2.57385e-16   9.51302e-16
 -1.97373e-16   5.99676e-16   2.97838e-16  -1.7159e-16   -1.01532e-15

I'm not saying the above is the best possible implementation, just that it gives the answer I expect. I would propose (and be willing to put up a PR for!) a change where the current implementation of unscaled_covzm is specialized to AbstractVector{<:Number} and there's a different general implementation of it for AbstractVector, similar to what I put up above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant