Why are extrema needed for ContinuousTerm? #222

Open
matthieugomez opened this issue Apr 13, 2021 · 0 comments

matthieugomez commented Apr 13, 2021

While trying to improve the performance of FixedEffectModels.jl, I noticed that schema accounts for a substantial share of the run time:

using StatsModels

N = 10_000_000
# Named tuple of columns, which schema accepts as a table
df = (y = rand(N), x1 = rand(N), x2 = rand(N))
@time StatsModels.schema(@formula(y~x1+x2), df)
#  0.151931 seconds (49 allocations: 3.625 KiB)

The reason is that schema calls extrema on each term in the formula:

@time extrema(df.y), extrema(df.x1), extrema(df.x2)
#   0.127862 seconds (23 allocations: 1.422 KiB, 5.44% compilation time)

Is there a way to avoid computing these extrema? Why are they needed in the first place? As an aside, calling extrema is slower than calling minimum and maximum separately (see JuliaLang/julia#31442).
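To make the aside about extrema concrete, here is a minimal sketch that uses only Base Julia to compare the two approaches on one vector. The helper name `separate_extrema` is hypothetical (it is not part of Base or StatsModels), and the relative timings depend on the Julia version and machine, so they may not reproduce the gap described in the linked issue.

```julia
# Sketch: compare extrema(x) with separate minimum(x) and maximum(x) calls.
x = rand(1_000_000)

# Hypothetical helper, defined here only for illustration.
separate_extrema(v) = (minimum(v), maximum(v))

# Run both once first so compilation time is excluded from the timings.
extrema(x)
separate_extrema(x)

t_extrema  = @elapsed extrema(x)
t_separate = @elapsed separate_extrema(x)

println("extrema:           ", t_extrema, " s")
println("minimum + maximum: ", t_separate, " s")
```

Both calls return the same `(min, max)` tuple for a vector without NaN values, so the comparison is purely about speed.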
