Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cumulant function is not numerically stable #899

Open
mattcbro opened this issue Oct 19, 2023 · 4 comments
Open

Cumulant function is not numerically stable #899

mattcbro opened this issue Oct 19, 2023 · 4 comments
Labels

Comments

@mattcbro
Copy link

The recursion used for calculating cumulants from a sequence of data is not numerically stable. Apparently this is known in the literature and is discussed in this paper:
https://www-2.rotman.utoronto.ca/~kan/papers/ncq.pdf

I offer a simple test showing that the cumulants of a gaussian vector, which should all be zero after the second one, diverge.

using StatsBase
using Plots

# create a long sampled vector from a normal distribution
N = 10000
v = randn(N)


# compute and plot the cumulants.  Only the second
# one should be nonzero
kix = 1:15
rkstats = cumulant(v, kix)
plot(kix, rkstats)
@ParadaCarleton
Copy link
Contributor

Thank you so much for opening this issue! Would you be willing to open a PR implementing the more precise algorithms? (Is there any performance cost to doing so?)

@alecloudenback
Copy link

alecloudenback commented Oct 28, 2023

Scipy restricts the range to 1:4, perhaps to avoid the numerical instabilities?

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kstat.html

@mattcbro
Copy link
Author

Thank you so much for opening this issue! Would you be willing to open a PR implementing the more precise algorithms? (Is there any performance cost to doing so?)

I'll look into it. I haven't tried to implement the one in the paper. Since it requires a solution to an eigenvalue problem it may be slower, but correctness is of course more important.

@ParadaCarleton
Copy link
Contributor

ParadaCarleton commented Oct 31, 2023

That would be great, thank you!

It might also be worth checking out what OnlineStats.jl does. IME it usually has cleaner/faster/more accurate implementations than we do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants