Infinite Values for norm2 #35

ilyakorsunsky · 2018-04-27T22:07:39Z

Hi, for my larger datasets (250,000 x 2000) run with the R code (fastpath=FALSE), I run into the problem that some of the data structures (e.g. V) get so large that the L2 norm (norm2) gets infinite. Then I get errors comparing R and S and eps2, because R or S are infinite. I fixed this problem by scaling by the max of the vector before doing the L2 scaling (example below). Then the code runs to completion. However, I get really large results (e.g. max d is 1e150), which don't match the C implementation. I suspect these large values are themselves the result of a bug.

  V[, 1] <- max_scale(V[, 1])
  V[, 1] <- V[, 1] / norm2(V[, 1])
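
For readers following along, here is a minimal sketch of what this workaround could look like on a single vector. The post does not show the definition of `max_scale`, so the version below is an assumption (divide by the largest absolute entry), and `norm2` is written as a plain 2-norm rather than copied from the package.

```r
# Hypothetical helper: the definition of max_scale is not shown in the post.
# Assumed here to divide a vector by its largest absolute entry.
max_scale <- function(v) {
  m <- max(abs(v))
  if (!is.finite(m) || m == 0) return(v)  # nothing sensible to scale by
  v / m
}

# Plain 2-norm (assumed form; not copied from the package source).
norm2 <- function(v) sqrt(drop(crossprod(v)))

v <- c(3e200, 4e200)   # squaring these entries overflows double precision
norm2(v)               # Inf, so v / norm2(v) would collapse to zeros

u <- max_scale(v)      # entries now lie in [-1, 1]
u <- u / norm2(u)      # unit-length vector, approximately c(0.6, 0.8)
```

Scaling first keeps the squared entries within range, which is why the run completes. It does not explain why the entries of V grow that large to begin with.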

This may be a related issue: when I ran the C version (fastpath=TRUE) yesterday on the same data, I got the error message "BLAS/LAPACK routine 'DLASCL' gave error code -4". It seems that this error arises when there are NA or Inf values in the original matrix. I wonder whether it can also arise from Inf values produced by the L2 norm computation. Strangely, I ran the same thing today and did not get this error, so if this is not an issue others have, please ignore.

Thanks for looking into this!

bwlewis · 2018-04-27T22:28:59Z

Thanks for this... will investigate.


bwlewis · 2019-02-05T04:29:18Z

Yes indeed, I can replicate these behaviors with badly scaled data due to floating point overflow. For example:

x = rep(sqrt(.Machine$double.xmax) * 10, 2)
# now its 2-norm:
sqrt(drop(crossprod(x)))
[1] Inf

However, I have not yet been able to cook up a toy example that illustrates significant differences between the R and C code paths.

In any case, I don't yet have a great solution. Am open to ideas!
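
One idea, offered as a sketch rather than a tested fix for the package: compute the 2-norm with the largest absolute entry factored out, so that nothing is squared at full magnitude. The name `safe_norm2` is hypothetical and not part of the package.

```r
# Hypothetical overflow-resistant 2-norm; safe_norm2 is an illustrative name.
# Uses the identity ||v|| = m * ||v / m|| with m = max(abs(v)).
safe_norm2 <- function(v) {
  m <- max(abs(v))
  if (!is.finite(m) || m == 0) return(m)  # Inf/NA/NaN input, or all zeros
  m * sqrt(sum((v / m)^2))                # every squared term is at most 1
}

x <- rep(sqrt(.Machine$double.xmax) * 10, 2)
sqrt(drop(crossprod(x)))   # Inf: each x^2 exceeds .Machine$double.xmax
safe_norm2(x)              # finite, roughly 1.9e155
```

This only keeps the norm itself finite when the true value is representable. It would not address whatever makes the vectors grow so large, and it costs an extra pass over the data.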
