Use estimated quantiles instead of std #27

Open
baggepinnen opened this issue Aug 4, 2019 · 3 comments
Comments

@baggepinnen
Owner

Similar to #26
Instead of reducing particles to an uncertainty estimate by calling std, we can make use of the sample quantiles, at least if the number of particles is large. Example of a current std-based definition:

Base.:<(a::Real, p::AbstractParticles, lim=2) = (mean(p)-a)/std(p) > lim

@baggepinnen
Owner Author

Asking @cscherrer for input here: I currently have definitions like the following to determine whether particles differ significantly from each other or from a scalar:

Base.:≈(a::Real, p::AbstractParticles, lim=2) = abs(mean(p)-a)/std(p) < lim
Base.:>(p::AbstractParticles, a::AbstractParticles, lim=2) = (mean(p)-mean(a))/(2sqrt(std(p)^2 + std(a)^2)) > lim

but I guess it would make sense to change these to use sample quantiles instead, at least if the number of particles is reasonably large (it might not make sense for sigma points etc.). E.g., the following definition:

Base.:≈(a::Real, p::AbstractParticles, q=0.025) = quantile(p,q) < a < quantile(p,1-q)

I think this makes sense, but I am open to hearing your thoughts on it.
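To make the difference between the two rules concrete, here is a minimal sketch on plain sample vectors. The names `approx_std` and `approx_quantile` are hypothetical helpers for illustration only, and a `Vector` stands in for `AbstractParticles`; this is not the package's implementation.

```julia
using Statistics

# Hypothetical helpers; a plain vector of samples stands in for AbstractParticles.

# Std rule: a is compatible with the sample if it lies within `lim` standard deviations of the mean.
approx_std(a::Real, x::AbstractVector, lim=2) = abs(mean(x) - a) / std(x) < lim

# Quantile rule: a is compatible if it lies inside the central 1-2q interval of the sample.
approx_quantile(a::Real, x::AbstractVector, q=0.025) = quantile(x, q) < a < quantile(x, 1 - q)

# A deterministic, right-skewed sample: a uniform grid on [0, 1], squared.
x = range(0, 1; length=1001) .^ 2

std_accepts      = approx_std(-0.1, x)       # -0.1 lies below every sample, yet is within 2 std of the mean
quantile_accepts = approx_quantile(-0.1, x)  # the quantile interval cannot extend below the sample minimum
```

For a skewed sample like this one, the std rule accepts a value that no particle comes close to, while the quantile rule rejects it. With very few particles the extreme quantiles are poorly estimated, which is the caveat raised above.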

@cscherrer

cscherrer commented Aug 6, 2019

Oh this is tricky - I think it really depends on what use case you're targeting.

If you're considering particles as proxies for random variables, I'd think the ordering matters (since permuted particles are no longer equal). So I'd probably go with "difference is not statistically significant", so maybe something like

function Base.:≈(a::AbstractParticles, b)
    lo,hi = quantile(a-b, (0.05,0.95))
    lo < 0.0 < hi
end

EDIT: I just realized this is the same as your example. Been a long day :)
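The point about ordering can be illustrated with a small sketch on paired sample vectors. The helper name `not_significantly_different` and the keyword `q` are assumptions made for this example; the check itself follows the quantile-of-the-difference idea above.

```julia
using Statistics

# Hypothetical helper: a and b are sample vectors whose i-th entries are paired.
# The difference is "not significant" if 0 lies inside the central quantile interval of a - b.
function not_significantly_different(a::AbstractVector, b::AbstractVector; q=0.05)
    lo, hi = quantile(a .- b, (q, 1 - q))
    return lo < 0 < hi
end

a = collect(range(0, 1; length=101))
shifted = a .+ 0.5            # every pair differs by exactly the same offset
permuted = reverse(shifted)   # same values as `shifted`, but paired differently

paired_result   = not_significantly_different(a, shifted)   # the difference is constant and nonzero
permuted_result = not_significantly_different(a, permuted)  # the differences now spread across zero
```

The two calls compare the same pair of marginal distributions and can still disagree, because the pairing of particles carries information. Note that with strict inequalities a sample compared with itself gives a difference that is identically zero, so this sketch would report `false` in that case.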

If you want to consider them only as distributions, then I'd probably go with the Wasserstein distance (also known as the earth mover's distance).
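For two one-dimensional samples with the same number of equally weighted points, the 1-Wasserstein distance reduces to the mean absolute difference between the sorted samples. A minimal sketch under that assumption (the function name `wasserstein1` is hypothetical, and plain vectors stand in for particles):

```julia
using Statistics

# 1-Wasserstein distance between two equally sized, equally weighted 1-D samples:
# match the i-th smallest point of x with the i-th smallest point of y.
function wasserstein1(x::AbstractVector, y::AbstractVector)
    length(x) == length(y) || throw(ArgumentError("samples must have equal length"))
    return mean(abs.(sort(x) .- sort(y)))
end

x = collect(range(0, 1; length=101))

d_permuted = wasserstein1(x, reverse(x))  # permuting the particles does not change the distance
d_shifted  = wasserstein1(x, x .+ 0.5)    # a pure shift by 0.5 gives a distance of about 0.5
```

Unlike the paired-difference check, this ignores the ordering of the particles, and it returns a distance rather than a yes/no answer, so a threshold would still have to be chosen.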

@baggepinnen
Owner Author

A distribution-free test statistic for determining whether two samples come from the same distribution:

A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, and A. Smola.
A kernel two-sample test.
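The statistic in that paper is the maximum mean discrepancy (MMD). Below is a rough sketch of the biased empirical estimate of the squared MMD for scalar samples. The Gaussian kernel, the bandwidth `σ`, and the function names are assumptions chosen for illustration; the sketch computes only the statistic and does not include a rejection threshold.

```julia
using Statistics

# Gaussian (RBF) kernel with bandwidth σ; the kernel choice is an assumption of this sketch.
gaussian_kernel(x, y; σ=1.0) = exp(-abs2(x - y) / (2σ^2))

# Biased estimate of the squared MMD:
# mean k(x, x') + mean k(y, y') - 2 * mean k(x, y), averaged over all pairs.
function mmd2_biased(x::AbstractVector, y::AbstractVector; σ=1.0)
    kxx = mean(gaussian_kernel(xi, xj; σ=σ) for xi in x, xj in x)
    kyy = mean(gaussian_kernel(yi, yj; σ=σ) for yi in y, yj in y)
    kxy = mean(gaussian_kernel(xi, yj; σ=σ) for xi in x, yj in y)
    return kxx + kyy - 2kxy
end

x = collect(range(-1, 1; length=50))

mmd_same    = mmd2_biased(x, x)        # identical samples
mmd_shifted = mmd2_biased(x, x .+ 3)   # clearly separated samples
```

To turn this into a test, the statistic would need to be compared against a threshold, for example one obtained by recomputing it on random re-splits of the pooled samples. The cost grows quadratically with the number of particles.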


2 participants