
Extremely long runtimes in numpy.fft.fft (Trac #1266) #1864

Closed
thouis opened this issue Oct 19, 2012 · 19 comments

Comments

@thouis
Contributor

thouis commented Oct 19, 2012

Original ticket http://projects.scipy.org/numpy/ticket/1266 on 2009-10-19 by trac user koehler, assigned to unknown.

Although the documentation of numpy.fft.fft states:
"This is most efficient for n a power of two. This also stores a cache of working memory for different sizes of fft's, so you could theoretically run into memory problems if you call this too many times with too many different n's.",
I think it is important to report this oddity in the fft runtime. Depending on the array length, the runtime varies extremely:

[ipython shell, from numpy import *]
In [1]: %time fft.fft(zeros(119516))
CPU times: user 22.83 s, sys: 0.39 s, total: 23.23 s
Wall time: 23.53 s

In [3]: %time fft.fft(zeros(119517))
CPU times: user 36.33 s, sys: 0.08 s, total: 36.40 s
Wall time: 36.51 s

In [5]: %time fft.fft(zeros(119518))
CPU times: user 4.88 s, sys: 0.08 s, total: 4.96 s
Wall time: 5.02 s

In [7]: %time fft.fft(zeros(119519))
CPU times: user 0.45 s, sys: 0.00 s, total: 0.45 s
Wall time: 0.45 s

In [9]: %time fft.fft(zeros(119515))
CPU times: user 0.07 s, sys: 0.00 s, total: 0.08 s
Wall time: 0.08 s

In [11]: %time fft.fft(zeros(119514))
CPU times: user 15.84 s, sys: 0.06 s, total: 15.90 s
Wall time: 15.95 s

In [13]: %time fft.fft(zeros(119513))
CPU times: user 272.75 s, sys: 1.03 s, total: 273.78 s
Wall time: 275.63 s
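
The pattern above tracks the prime factorization of each length: FFTPACK's mixed-radix algorithm handles factors of 2, 3, 4 and 5 efficiently, but falls back to a direct DFT for any other prime factor, so the cost grows with the largest prime factor. A minimal sketch to inspect the factorizations of the lengths timed above, using a small illustrative helper (prime_factors is not part of numpy):

# Trial-division factorization of the lengths timed above; the slow cases
# are the ones whose largest prime factor is big.
def prime_factors(n):
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

for n in range(119513, 119520):
    print(n, prime_factors(n))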
@numpy-gitbot

@endolith wrote on 2009-11-20

Related: #1177 http://projects.scipy.org/scipy/ticket/949

@numpy-gitbot

@rgommers wrote on 2011-03-01

David C. has implemented the Bluestein transform if you need it: https://github.com/cournape/numpy/tree/bluestein

Should hopefully land in Numpy trunk soon.

@numpy-gitbot

Milestone changed to Unscheduled by @mwiebe on 2011-03-25

@juliantaylor
Contributor

In this PR, padding to small primes is proposed:
scipy/scipy#3144
Having a function to compute a good padding size might be a useful utility in numpy's fftpack.

@endolith
Contributor

Yes, instead of m = 2 ** nextpow2(2 * n - 1) it will be faster to use something like the next_regular function.
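
For reference, next_regular finds the smallest 5-smooth number (one whose only prime factors are 2, 3 and 5) at or above a target size. A minimal sketch of that idea, not scipy's exact implementation:

def next_regular(target):
    # Smallest 5-smooth number (2**a * 3**b * 5**c) >= target.
    if target <= 6:
        return target
    best = 2 ** (target - 1).bit_length()       # a power of two is always admissible
    p5 = 1
    while p5 < best:
        p35 = p5
        while p35 < best:
            quotient = -(-target // p35)            # ceil(target / p35)
            p2 = 2 ** (quotient - 1).bit_length()   # smallest power of 2 lifting p35 to >= target
            candidate = p2 * p35
            if candidate == target:
                return candidate
            if candidate < best:
                best = candidate
            p35 *= 3
        p5 *= 5
    return best

# e.g. next_regular(119513) -> 120000 (= 2**6 * 3 * 5**4)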

@smcantab

I have also come across this issue using the detect_equilibration function of pymbar, which repeatedly calls np.fft and np.ifft through the statsmodels autocorrelation function on many increasingly shorter arrays. I found this out by profiling the code, which ultimately led me to this thread. The only workaround so far is to explicitly call

 np.fft.fftpack._fft_cache.clear()

to make sure that the memory requirement does not grow dangerously. This does not seem to be the ideal solution, though. It would be nice to have either a kwarg such as "memsafe=True" and/or a function to manually clear the cache without referring to the global variable explicitly.
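
For older, fftpack-based numpy, a hedged sketch of such a workaround: a hypothetical wrapper, fft_nocache, that clears the private cache after each call (the cache attribute is an implementation detail and disappears in later numpy releases):

import numpy as np

def fft_nocache(x, *args, **kwargs):
    # Compute the FFT, then drop the twiddle-factor cache so memory does not
    # accumulate when many different lengths are transformed.
    result = np.fft.fft(x, *args, **kwargs)
    cache = getattr(getattr(np.fft, 'fftpack', None), '_fft_cache', None)
    if cache is not None:                      # only present in fftpack-based numpy
        cache.clear()
    return result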

@endolith
Contributor

@juliantaylor Padding isn't applicable to plain FFTs, correct? Just to convolution/correlation?

@rgommers The Bluestein algorithm does speed up the FFT for prime sizes, as done in scipy/scipy#4288 , but requires pre-computation of complex chirps, and is most efficient when you keep the chirp in memory and repeatedly re-use it on chunks of data. So I'm not sure if this is good for numpy and maybe just defer to scipy.fftpack.czt for this application?

Anyway, I think this can be closed as a duplicate of #1177? Unless something else like Rader's algorithm is better than Bluestein's?

And @smcantab's issue is different?
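
For readers unfamiliar with it, a minimal sketch of the Bluestein/chirp-z idea discussed above (an arbitrary-length DFT rewritten as a chirp multiplication plus a power-of-two-length circular convolution); this is an illustration, not the code in either PR:

import numpy as np

def bluestein_fft(x):
    # DFT of arbitrary length n via Bluestein's algorithm.
    x = np.asarray(x, dtype=complex)
    n = x.shape[0]
    k = np.arange(n, dtype=np.float64)
    w = np.exp(-1j * np.pi * k * k / n)          # chirp exp(-i*pi*k^2/n)
    m = 1 << (2 * n - 1).bit_length()            # power-of-two size >= 2n-1
    a = np.zeros(m, dtype=complex)
    a[:n] = x * w
    b = np.zeros(m, dtype=complex)
    b[:n] = np.conj(w)
    if n > 1:
        b[m - n + 1:] = np.conj(w[:0:-1])        # mirror chirp for negative lags
    conv = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b))   # circular convolution
    return w * conv[:n]

x = np.random.randn(101)                         # small prime length for a sanity check
assert np.allclose(bluestein_fft(x), np.fft.fft(x))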

@smcantab

@endolith this was a long time ago :) but yes, now that I look at it again it does seem to be a different issue. The problem I reported might still be relevant, though, and I had to implement the workaround I suggested in pymbar for it to work.

@standarddeviant

FYI, I ran into someone at an audio conference who showed me this example. It is easy to reproduce and extreme.

%time np.fft.fft( np.random.randn(100000) )
Wall time: 16 ms
Out[16]: 
array([ 196.58599022  +0.j        ,  -88.38483360 +89.2507627j ,
       -166.72250316+339.27161306j, ...,   12.22959535 -64.01621313j,
       -166.72250316-339.27161306j,  -88.38483360 -89.2507627j ])

%time np.fft.fft( np.random.randn(100003) )
Wall time: 1min 42s
Out[17]: 
array([  13.36160617  +0.j        , -314.86472577-340.44686425j,
       -258.36716707-170.43805382j, ...,  -21.18014704+441.3618185j ,
       -258.36716707+170.43805382j, -314.86472577+340.44686425j])

fft of length 1e5: 16 MILLIseconds

fft of length 1e5 + 3: 1 MINUTE and 42 SECONDS

@charris
Member

charris commented Oct 18, 2017

Feature, not a bug. FFTPACK is only "fast" when the size factors as a product of the numbers 2, 3, 4, 5. There has been a long-standing desire to use a faster algorithm for large prime sizes, but it has not been implemented. Note that 100003 is prime.

@charris charris closed this as completed Oct 18, 2017
@charris charris reopened this Oct 18, 2017
@endolith
Contributor

I wouldn't call it a "feature", but it's normal and not a bug. :)

scipy/scipy#4288 has Bluestein's algorithm, but it requires pre-computation of a complex chirp, so we would need to do some testing to see at which prime sizes it becomes worthwhile.

@standarddeviant

Interesting. The main thing I know is that the default FFT implementations in Julia and Matlab don't have this behavior. I'm curious what the Julia implementation does to avoid it.

@charris
Member

charris commented Oct 19, 2017

Julia and Matlab call FFTW, which we cannot do because of its license.

@charris
Member

charris commented Oct 19, 2017

Note that there are Python bindings for FFTW; pyFFTW seems rather current. If FFT speed is a concern, that is probably the way to go. FFTPACK was a good implementation for its day, but code and hardware have moved on.
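
For anyone going that route, a minimal sketch using pyFFTW's numpy-like interface (assuming pyFFTW is installed; consult its documentation for the current API):

import numpy as np
import pyfftw

pyfftw.interfaces.cache.enable()               # keep FFTW plans between calls
x = np.random.randn(100003)                    # prime length that is slow in FFTPACK
X = pyfftw.interfaces.numpy_fft.fft(x)         # drop-in replacement for np.fft.fft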

@standarddeviant

@charris I definitely appreciate the info and that's unfortunate, but makes sense regarding the license.

@mreineck
Contributor

mreineck commented Apr 5, 2019

This should be fixed in numpy 1.17.
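
(NumPy 1.17 replaced the old FFTPACK routines with pocketfft, which handles large prime lengths with Bluestein's algorithm.) A quick way to check on a given installation; the snippet only times the two lengths from the example above, no numbers are asserted:

import time
import numpy as np

for n in (100000, 100003):                     # composite vs. prime length
    x = np.random.randn(n)
    t0 = time.perf_counter()
    np.fft.fft(x)
    print(n, time.perf_counter() - t0, 'seconds')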

@rgommers
Member

rgommers commented Apr 5, 2019

thanks @mreineck, closing

@rgommers rgommers closed this as completed Apr 5, 2019