bin and reduce #893

Open
jerlich opened this issue Sep 19, 2023 · 0 comments

jerlich commented Sep 19, 2023

I regularly want to bin 2D data on the first element and then apply a reducing function to the second element within each bin.

I wrote this for the 2D case, but I have similar functions for the 3D case (bin on x, y and apply a function to z).

using StatsBase

"""
    binned(x, y, bins, μ, E)

Take vectors `x` and `y` of equal length, bin `x` according to the edges in `bins`, then apply `μ` and `E` to the values of `y` grouped by the bin of `x`.

`μ` is any function that takes a vector and returns a number (usually `mean`).
`E` is any function that takes a vector and returns a 2-element iterable holding the lower and upper CI bounds for that bin.

Returns a 3-tuple `(bin_c, μ_y, E_y)`, where `bin_c` holds the bin centers and `E_y` is a 2-tuple of (lower, upper) error-bar lengths, so that you can plot the results with `plot(bin_c, μ_y, yerr=E_y)`. Empty bins yield `NaN`.
"""
function binned(x, y, bins, μ, E)
    @assert length(x) == length(y)
    h = fit(Histogram, x, bins)

    ox = (bins[1:end-1] + bins[2:end]) / 2   # bin centers
    xmap = StatsBase.binindex.(Ref(h), x)    # bin index of each element of x

    # One Boolean mask per bin, computed once and shared by μ and E
    masks = [xmap .== z for z in eachindex(ox)]
    oy = [any(m) ? μ(y[m]) : NaN for m in masks]
    oe = [any(m) ? E(y[m]) : [NaN, NaN] for m in masks]
    # `oe` is a vector of 2-element bounds, but we want a 2-tuple of vectors
    (ox, oy, (oy .- getindex.(oe, 1), getindex.(oe, 2) .- oy))
end
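
For illustration, the same bin-then-reduce idea can also be sketched without `fit(Histogram, ...)`, using `searchsortedlast` from Base to find each bin index. This is only a minimal sketch, not a proposed API: the name `bin_reduce` and the toy data are hypothetical, and it assumes sorted bin edges and left-closed bins `[edges[i], edges[i+1])`.

```julia
using Statistics  # standard library, provides `mean`

# Hypothetical helper: group `y` by the bin of `x`, then reduce each group with `f`.
# Assumes `edges` is sorted; bins are left-closed, [edges[i], edges[i+1]).
function bin_reduce(f, x, y, edges)
    length(x) == length(y) || throw(DimensionMismatch("x and y must have equal length"))
    nbins = length(edges) - 1
    groups = [eltype(y)[] for _ in 1:nbins]
    for (xi, yi) in zip(x, y)
        i = searchsortedlast(edges, xi)          # index of the last edge <= xi
        1 <= i <= nbins && push!(groups[i], yi)  # drop points outside the edges
    end
    centers = [(edges[i] + edges[i+1]) / 2 for i in 1:nbins]
    centers, [isempty(g) ? NaN : f(g) for g in groups]
end

x = [0.1, 0.4, 1.2, 1.8, 3.5]
y = [1.0, 3.0, 10.0, 20.0, 7.0]
centers, means = bin_reduce(mean, x, y, 0:1:4)
# centers → [0.5, 1.5, 2.5, 3.5]
# means   → [2.0, 15.0, NaN, 7.0]   (the third bin is empty)
```

Collecting each bin's values in a single pass avoids building one Boolean mask per bin, which matters when there are many bins.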

Would something like this (cleaned up, tested, etc.) fit into StatsBase?
