Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fallback to gesvd if gedd LAPACK fails in numpy backend #962

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

gefux
Copy link
Contributor

@gefux gefux commented May 10, 2022

Solves: #896

Hi TN people!
Thank you very much for your work so far!

Much like in #896 users of our project OQuPy experience sometimes non-reproducible "SVD did not converge" errors. Very unpleasantly they occur more often with large tensors that are often the result of hours of computation time. I believe most (if not all) of these errors could be avoided by falling back to the _gesvd LAPACK routine in scipy.

Numpy uses the _gesdd LAPACK routine for performing SVDs. In a nutshell: _gesdd routine is for large matrices much faster then the _gesvd routine, but _gesvd is more stabil for badly conditioned matrices.

If one searches for "scipy gesdd SVD did not converge" one finds a lot of issues on github and stackoverflow and 90% of the conversation can be summarised as "I cannot reproduce your/my issue."

In this Issue, @pv (Pauli Virtranen) is pointing out:

Floating-point reproducibility is generally not guaranteed (see e.g. intel compiler manuals) on any platform, even for code in the same program can give different results due to e.g. data alignment, unless you take special care with library and compiler options, as it depends on things such as compiler optimizations. This is not generally considered a bug, and correct code should not assume it. It is a speed/precision tradeoff, and official binaries provided by scipy project do not ensure 100% fp reproducibility.

However, when browsing through the dozens of issues, it seems that they are often resolve when using the _gesvd LAPACK routine (through scipy). Because _gesdd is mostly much faster I think it is reasonable to keep this as the default, but fall back to the _gesvd routine if the _gesdd fails. This is also done in the TeNPy package here.

Following the Zen of Python: "Errors should never pass silently.", I've added a warning to the fall back to gesdd.
Unfortunately, I don't know how to test it, because I don't know how to construct a matrix that will fail reliably with gesdd.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant