Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

leading eigenvectors are very dependent on seed when there are many small eigenvals #41

Open
wamiks opened this issue Oct 5, 2018 · 5 comments

Comments

@wamiks
Copy link

wamiks commented Oct 5, 2018

Hi,

Using your code from your RSPectra comparison with eigenvalues like these:

 [1] 1.2010e+05 3.3975e+04 1.5586e+04 8.3248e+03 7.5574e+03 7.0686e+03 6.2953e+03 6.1647e+03 5.5930e+03 5.3994e+03 4.9431e+03         
 [12] 4.6991e+03 4.3388e+03 4.1773e+03 3.9298e+03 3.8293e+03 3.7712e+03 3.7303e+03 3.6607e+03 3.5395e+03 3.4908e+03 3.3796e+03         
 [23] 3.2971e+03 3.2209e+03 3.1451e+03 3.0029e+03 2.8943e+03 2.8009e+03 2.7219e+03 2.6431e+03 9.7071e-10 6.6343e-10 2.8837e-10         
 [34] 1.8375e-10 1.0578e-10 1.0274e-10 7.8556e-11 7.1549e-11 5.6877e-11 4.8820e-11 4.0724e-11 3.5984e-11 3.4707e-11 2.8522e-11         
 [45] 2.5319e-11 2.5076e-11 2.3933e-11 2.3539e-11 2.1689e-11 2.1335e-11 2.0271e-11 1.9699e-11 1.9426e-11 1.8862e-11 1.8385e-11         
 [56] 1.8283e-11 1.7684e-11 1.7657e-11 1.6216e-11 1.5041e-11 1.3888e-11 1.3781e-11 1.2863e-11 1.2862e-11 1.2635e-11 1.2570e-11         
 [67] 1.2540e-11 1.1974e-11 1.1850e-11 1.1368e-11 1.1355e-11 1.1192e-11 1.0941e-11 1.0780e-11 1.0678e-11 1.0511e-11 9.8443e-12         
 [78] 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12         
 [89] 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12         
[100] 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12         
[111] 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12         
[122] 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12         
[133] 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12         
[144] 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12         
[155] 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12         
[166] 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12         
[177] 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12         
[188] 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12 9.8073e-12         
[199] 9.8073e-12 9.8073e-12 9.8073e-12 

for example gives very different leading eigenvectors depending on the random seed. svd and RSpectra seem mostly stable using this.

best,
michael

@bwlewis
Copy link
Owner

bwlewis commented Oct 5, 2018

Sorry, can you give a code example and I will investigate? Thanks.

@wamiks
Copy link
Author

wamiks commented Oct 5, 2018

code is from here: https://bwlewis.github.io/irlba/comparison.html
example 1

@bwlewis
Copy link
Owner

bwlewis commented Oct 5, 2018

Sorry I should be more specific, I don't follow which example you're referring to. There are several examples there and none of them have exactly 200 singular values (some many more, some far less). Can you point me to a specific example? Sorry to be so confused.Thanks...

@wamiks
Copy link
Author

wamiks commented Oct 6, 2018

Sorry, it's from example 1. Use the above set for d and the run the rest twice with 2 different seeds. The 1st eigenvalues from the 2 runs will have low correlation. I know this is probably an edge case, but it might point out some underflow or something in the algo.

@bwlewis
Copy link
Owner

bwlewis commented Feb 5, 2019

Hmmmm. I wrote this test:

ibrary(irlba)                                                                                                                           
f = function(seed)
{
  d = c(1.2010e+05,3.3975e+04,1.5586e+04,8.3248e+03,7.5574e+03,7.0686e+03,6.2953e+03,6.1647e+03,5.5930e+03,5.3994e+03,4.9431e+03,4.6991e+03,4.3388e+03,4.1773e+03,3.9298e+03,3.8293e+03,3.7712e+03,3.7303e+03,3.6607e+03,3.5395e+03,3.4908e+03,3.3796e+03,3.2971e+03,3.2209e+03,3.1451e+03,3.0029e+03,2.8943e+03,2.8009e+03,2.7219e+03,2.6431e+03,9.7071e-10,6.6343e-10,2.8837e-10,1.8375e-10,1.0578e-10,1.0274e-10,7.8556e-11,7.1549e-11,5.6877e-11,4.8820e-11,4.0724e-11,3.5984e-11,3.4707e-11,2.8522e-11,2.5319e-11,2.5076e-11,2.3933e-11,2.3539e-11,2.1689e-11,2.1335e-11,2.0271e-11,1.9699e-11,1.9426e-11,1.8862e-11,1.8385e-11,1.8283e-11,1.7684e-11,1.7657e-11,1.6216e-11,1.5041e-11,1.3888e-11,1.3781e-11,1.2863e-11,1.2862e-11,1.2635e-11,1.2570e-11,1.2540e-11,1.1974e-11,1.1850e-11,1.1368e-11,1.1355e-11,1.1192e-11,1.0941e-11,1.0780e-11,1.0678e-11,1.0511e-11,9.8443e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12,9.8073e-12)
  set.seed(seed)
  N = 3
  tol = 1e-15
  m = length(d)
  u = qr.Q(qr(matrix(rnorm(m ^ 2), m)))
  v = qr.Q(qr(matrix(rnorm(m ^ 2), m)))
  x = u %*% diag(d) %*% t(v)
  s = svd(x)
  l = irlba(x, N, tol=tol)
  abs(crossprod(a$s$u[,1], a$l$u[,1])) - 1
}

and on my laptop I get for 100 seed values from 1...100:

ans = unlist(Map(f, seq(100)))
summary(ans)
##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
## -8.882e-16 -8.882e-16 -8.882e-16 -8.882e-16 -8.882e-16 -8.882e-16

Note that singular vectors are unique only up to sign, so perhaps the difference you're seeing is a sign difference? The comparison above takes the sign difference into account.

I do know of other pathologies with irlba, but I'm not sure that this example fits into those cases, unless maybe I missed something...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants