Infinite Values for norm2 #35

Open
ilyakorsunsky opened this issue Apr 27, 2018 · 2 comments

ilyakorsunsky commented Apr 27, 2018

Hi, for my larger datasets (250,000 x 2000) run with the R code (fastpath=FALSE), some of the data structures (e.g. V) contain values so large that their L2 norm (norm2) overflows to Inf. I then get errors in the comparisons involving R, S, and eps2, because R or S is infinite. I worked around this by scaling each vector by its maximum before doing the L2 scaling (example below), after which the code runs to completion. However, the results are extremely large (e.g. max d is 1e150) and don't match the C implementation. I suspect these large values are themselves the result of a bug.

  V[, 1] <- max_scale(V[, 1])
  V[, 1] <- V[, 1] / norm2(V[, 1])
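
For readers who want to try the workaround above: the issue does not show how `max_scale` is defined, so the version below is only a guess at its intent (divide by the largest absolute entry). `norm2` is written here after the toy example later in this thread and is likewise an assumption about the package's internal helper.

```r
# Hypothetical definition: the issue does not show max_scale(), so this is
# an assumed implementation that divides by the largest absolute entry.
max_scale <- function(x) {
  m <- max(abs(x))
  if (m == 0 || !is.finite(m)) return(x)  # nothing sensible to scale by
  x / m
}

# Plain 2-norm (assumed form); the squares can overflow to Inf.
norm2 <- function(x) sqrt(drop(crossprod(x)))

# A vector whose entries are finite but whose squares overflow.
big <- sqrt(.Machine$double.xmax) * 10
v <- c(3, 4) * big

norm2(v)            # Inf, so v / norm2(v) would collapse to zeros

u <- max_scale(v)   # entries now lie in [-1, 1]
u <- u / norm2(u)   # unit vector in the original direction
```

Pre-scaling changes only the length of the vector, not its direction, so the normalized result is the one the plain formula would give if it did not overflow.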

This may be a related issue: when I ran the C version (fastpath=TRUE) yesterday on the same data, I got the error message "BLAS/LAPACK routine 'DLASCL' gave error code -4". This error seems to arise when the original matrix contains NA or Inf values. I wonder whether it can also arise from Inf values produced by the L2 norm computation. Strangely, when I ran the same thing today I did not get this error, so if this is not an issue others have, please ignore it.
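
One way to rule out the first possibility is to count the non-finite entries in the input before calling the solver. This is a minimal sketch for a dense base R numeric matrix; `count_nonfinite` is a hypothetical helper name, not part of the package.

```r
# Hypothetical helper: counts NA, NaN and +/-Inf entries of a dense
# numeric matrix. In R, is.na() is TRUE for NaN as well, so NaN is
# excluded from the NA count to keep the three categories separate.
count_nonfinite <- function(A) {
  c(na  = sum(is.na(A) & !is.nan(A)),
    nan = sum(is.nan(A)),
    inf = sum(is.infinite(A)))
}

# Small illustration with one NA, one NaN and two infinite entries.
A <- matrix(c(1, NA, Inf, 4, NaN, -Inf), nrow = 2)
count_nonfinite(A)
```

If all three counts are zero for the original matrix, any Inf or NaN reaching LAPACK would have to be produced during the computation itself.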

Thanks for looking into this!

bwlewis commented Apr 27, 2018 via email

bwlewis commented Feb 5, 2019

Yes indeed, I can replicate these behaviors with badly scaled data; they are caused by floating-point overflow. For example:

x = rep(sqrt(.Machine$double.xmax) * 10, 2)
# now its 2-norm:
sqrt(drop(crossprod(x)))
[1] Inf

However, I have not yet been able to cook up a toy example that illustrates significant differences between the R and C code paths.
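
For comparison, a scaled formulation of the 2-norm stays finite on the same toy vector. This is a sketch of the standard scaling trick, not code from the package; `scaled_norm2` is a hypothetical name.

```r
# Hypothetical overflow-resistant 2-norm: factor out the largest absolute
# entry so that every squared term is at most 1.
scaled_norm2 <- function(x) {
  m <- max(abs(x))
  if (m == 0 || !is.finite(m)) return(m)  # zero vector, or Inf already present
  m * sqrt(sum((x / m)^2))
}

# Same badly scaled vector as in the example above.
x <- rep(sqrt(.Machine$double.xmax) * 10, 2)

sqrt(drop(crossprod(x)))  # Inf: the squares overflow
scaled_norm2(x)           # finite, equal to sqrt(2) * x[1]
```

This only helps when the true norm is representable as a double. It does not explain why values grow this large in the first place.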

In any case, I don't yet have a great solution. I'm open to ideas!
