New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trace.remove_response() without nfft_pow2 option slow for certain npts #715
Conversation
There are some rare cases where numpy's FFT implementation is extremely slow. There is also some discussion on the numpy mailing list about this but I cannot find it right now, here is a github ticket instead: It depends on the number of samples in the data array fed to It probably depends on the machine and installation, but try the following. Note that the number of samples in both examples only differs by one! >>> import timeit
>>>timeit.timeit("import numpy as np; np.fft.rfft(np.empty(1037109))", number=1)
0.7895288467407227
# This one does not even finish. It hangs inside the timeit function.
# Otherwise it finishes in between 3 and 10 minutes on my machine...
>>>timeit.timeit("import numpy as np; np.fft.rfft(np.empty(1037108))", number=1) A way to work around this, without relying on the # Untested!
import pyfftw.interfaces
import numpy.fft
numpy.fft = pyfftw.interfaces.numpy_fft
# Turn on the cache for optimum performance
pyfftw.interfaces.cache.enable() Concluding I would say that having the |
Oh my, this is crazy.. no way to know beforehand for which npts this is in issue, I guess? I would really like to have as little low-level options as possible in here.. :/ |
@bgoutorbe, can you find out the exact |
this is exactly as krischer said: on both my home and work computers, rfft acts weird with some particular nb of samples: timeit.timeit("import numpy as np; np.fft.rfft(np.empty(1037109))", number=1) ...quite crazy :/ So, using megies' implementation: tr.data = np.random.random(int(1e6)) BUT with a cursed number: tr.data = np.random.random(900001) |
From my experience it (zero-padding to pow2 length) can safely and always (or by default) be done inside the remove_response method. I have always done it like that for performance reasons in the pyrocko.trace.Trace.transfer function, which probably works similar. Only sad that this trick does not work with resampling because of the fixed ratio between the lengths of the forward and backward transforms, so only one of the transforms can be made with a pow2 length. |
@megies, great and extremely interesting work! |
in case large primes are encountered in nfft factorization (which makes the computation take forever) we simply switch to using the next power of 2, which is not the fastest possibility but has robust computation time.
Oops, this has to go into releases not master.. |
Dear developers,
when working with large traces (say, nb of data ~ 1,000,000), the rfft inside 'remove_response' method hangs (on my computer at least).
In 'simulate' method there is an option nfft_pow2, which I always use to get away with this.
Is it possible to add this option to remove_response?
Thanks
BG