Trace.remove_response() without nfft_pow2 option slow for certain npts #715

megies · 2014-01-17T16:42:58Z

Dear developers,

when working with large traces (say, nb of data ~ 1,000,000), the rfft inside 'remove_response' method hangs (on my computer at least).

In 'simulate' method there is an option nfft_pow2, which I always use to get away with this.

Is it possible to add this option to remove_response?

Thanks
BG

megies · 2014-01-17T10:08:26Z

Hmm.. when putting together remove_response() I wanted to omit as much clutter as possible and I played with the nfft option a bit and really did not see it making any notable difference..

I just checked again a bit more thoroughly and for me the nfft option slows down the routine significantly:

I am not convinced that reintroducing this option would help..
How did you install ObsPy? Is it possible for you to temporarily install obspy from a test branch? pip install https://github.com/megies/obspy/archive/nfftpow2.zip should do the trick (depending on your Python setup/installation). Please run the following in Ipython then and post the output.

from obspy import read
import numpy as np

np.random.seed(123)
tr = read()[0]

tr.data = np.random.random(int(1e3))
%timeit tr.copy().remove_response(nfft_pow2=True)
%timeit tr.copy().remove_response(nfft_pow2=False)

tr.data = np.random.random(int(1e4))
%timeit tr.copy().remove_response(nfft_pow2=True)
%timeit tr.copy().remove_response(nfft_pow2=False)

tr.data = np.random.random(int(1e5))
%timeit tr.copy().remove_response(nfft_pow2=True)
%timeit tr.copy().remove_response(nfft_pow2=False)

tr.data = np.random.random(int(1e6))
%timeit tr.copy().remove_response(nfft_pow2=True)
%timeit tr.copy().remove_response(nfft_pow2=False)

krischer · 2014-01-17T10:43:19Z

There are some rare cases where numpy's FFT implementation is extremely slow. There is also some discussion on the numpy mailing list about this but I cannot find it right now, here is a github ticket instead:
numpy/numpy#1864

It depends on the number of samples in the data array fed to numpy.ftt.fft(). In that cases the nfft_pow2 option, while slower in most other cases, is much faster.

It probably depends on the machine and installation, but try the following. Note that the number of samples in both examples only differs by one!

>>> import timeit
>>>timeit.timeit("import numpy as np; np.fft.rfft(np.empty(1037109))", number=1) 
0.7895288467407227

# This one does not even finish. It hangs inside the timeit function.
# Otherwise it finishes in between 3 and 10 minutes on my machine...
>>>timeit.timeit("import numpy as np; np.fft.rfft(np.empty(1037108))", number=1)

A way to work around this, without relying on the nfft_pow2 option is to monkey patch numpy's fft with the implementation provided by pyfftw:

# Untested!
import pyfftw.interfaces
import numpy.fft

numpy.fft = pyfftw.interfaces.numpy_fft

# Turn on the cache for optimum performance
pyfftw.interfaces.cache.enable()

Concluding I would say that having the nfft_pow2 options might be necessary...

megies · 2014-01-17T10:52:45Z

Oh my, this is crazy.. no way to know beforehand for which npts this is in issue, I guess? I would really like to have as little low-level options as possible in here.. :/

megies · 2014-01-17T11:59:03Z

@bgoutorbe, can you find out the exact npts of the traces you had problems with and post them here? It seems the problem might be related to large prime numbers in the factorization of npts. If that is what was causing you problems we could work around the issue internally.

bgoutorbe · 2014-01-17T12:20:47Z

this is exactly as krischer said: on both my home and work computers, rfft acts weird with some particular nb of samples:

timeit.timeit("import numpy as np; np.fft.rfft(np.empty(1037109))", number=1)
0.589202880859375
timeit.timeit("import numpy as np; np.fft.rfft(np.empty(1037108))", number=1)
--> does not finish
timeit.timeit("import numpy as np; np.fft.rfft(np.empty(900001))", number=1)
--> does not finish either (and corresponds to the nb of pts in my trace)

...quite crazy :/

So, using megies' implementation:

tr.data = np.random.random(int(1e6))
%timeit tr.copy().remove_response(nfft_pow2=True)
--> 1 loops, best of 3: 18.5 s per loop
%timeit tr.copy().remove_response(nfft_pow2=False)
--> 1 loops, best of 3: 17.5 s per loop

BUT with a cursed number:

tr.data = np.random.random(900001)
%timeit tr.copy().remove_response(nfft_pow2=True)
--> 1 loops, best of 3: 18.4 s per loop
%timeit tr.copy().remove_response(nfft_pow2=False)
--> does not finish at all!

emolch · 2014-01-17T12:21:41Z

From my experience it (zero-padding to pow2 length) can safely and always (or by default) be done inside the remove_response method. I have always done it like that for performance reasons in the pyrocko.trace.Trace.transfer function, which probably works similar.

Only sad that this trick does not work with resampling because of the fixed ratio between the lengths of the forward and backward transforms, so only one of the transforms can be made with a pow2 length.

megies · 2014-01-17T15:41:16Z

I looked into the relation of npts to computation time a bit.. here's what I got:

The baseline of normal non-nextpow2 seems linear to me, but when large primes are involved (in the factorization of npts) then the computation time seems to be quadratic function of npts.

The factorization is fast so I think I'll include a check for large primes in npts and use next power of 2 internally only if we hit a large prime.

bgoutorbe · 2014-01-17T16:23:13Z

@megies, great and extremely interesting work!

in case large primes are encountered in nfft factorization (which makes the computation take forever) we simply switch to using the next power of 2, which is not the fastest possibility but has robust computation time.

megies · 2014-01-17T16:44:12Z

Oops, this has to go into releases not master..

megies · 2014-01-19T15:50:07Z

Here's a few plots (2D histograms) to summarize this issue:

A) Original state of Trace.remove_response() and Trace.simulate() (without specifying nfft_pow2=True) before pull request. Some npts that result in nfft points that have large primes in their factorization have very slow calculation time (some quadratic function, not acceptable for traces with large number of samples; does not finish for some bad values)

B) behavior for Trace.simulate(nfft_pow2=True). Failsafe but up two times slower.

C) If encountering a bad nfft value use next power of 2 (like nfft_pow2 does).

D) new behavior for Trace.remove_response() and Trace.simulate() (when not explicitely specifying nfft_pow2=True -- this behavior is left unchanged). If encountering a bad nfft value try some slightly larger values (10 tries), if no good value is encountered use next power of 2 (like nfft_pow2 does).

)

smart calculation of nfft in Trace.remove_response

4763aca

in case large primes are encountered in nfft factorization (which makes the computation take forever) we simply switch to using the next power of 2, which is not the fastest possibility but has robust computation time.

megies closed this Jan 17, 2014

megies added a commit that referenced this pull request Jan 19, 2014

Merge remote-tracking branch 'megies/smart_nfft' into releases (see #715

707647c

)

megies added a commit that referenced this pull request Jan 19, 2014

Merge branch 'releases' (see #715)

cfeef13

QuLogic added this to the 0.9.2 milestone Oct 20, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Trace.remove_response() without nfft_pow2 option slow for certain npts #715

Trace.remove_response() without nfft_pow2 option slow for certain npts #715

megies commented Jan 17, 2014

megies commented Jan 17, 2014

krischer commented Jan 17, 2014

megies commented Jan 17, 2014

megies commented Jan 17, 2014

bgoutorbe commented Jan 17, 2014

emolch commented Jan 17, 2014

megies commented Jan 17, 2014

bgoutorbe commented Jan 17, 2014

megies commented Jan 17, 2014

megies commented Jan 19, 2014

Trace.remove_response() without nfft_pow2 option slow for certain npts #715

Trace.remove_response() without nfft_pow2 option slow for certain npts #715

Conversation

megies commented Jan 17, 2014

megies commented Jan 17, 2014

krischer commented Jan 17, 2014

megies commented Jan 17, 2014

megies commented Jan 17, 2014

bgoutorbe commented Jan 17, 2014

emolch commented Jan 17, 2014

megies commented Jan 17, 2014

bgoutorbe commented Jan 17, 2014

megies commented Jan 17, 2014

megies commented Jan 19, 2014