Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DomainError from sqrt in ChisqTest #281

Open
andreasnoack opened this issue Nov 10, 2022 · 5 comments
Open

DomainError from sqrt in ChisqTest #281

andreasnoack opened this issue Nov 10, 2022 · 5 comments
Labels

Comments

@andreasnoack
Copy link
Member

andreasnoack commented Nov 10, 2022

Continued from @joshday's comment #125 (comment) in this new issue since the underlying bug is different. The reproducer is

julia> data = [
           946 2;
           1740 4;
           735 2;
           1390 3;
           203 0;
           2628 6
       ];

julia> ChisqTest(data)
Pearson's Chi-square Test
-------------------------
Population details:
    parameter of interest:   Multinomial Probabilities
    value under h_0:         [0.123501, 0.227201, 0.0960131, 0.181474, 0.0264459, 0.343146, 0.000274734, 0.000505419, 0.000213586, 0.000403697, 5.88303e-5, 0.000763344]
    point estimate:          [0.123515, 0.227184, 0.0959655, 0.181486, 0.0265048, 0.343126, 0.000261131, 0.000522261, 0.000261131, 0.000391696, 0.0, 0.000783392]
Error showing value of type PowerDivergenceTest:
ERROR: DomainError with -5.442823412516082:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
Stacktrace:
  [1] throw_complex_domainerror(f::Symbol, x::Float64)
    @ Base.Math ./math.jl:33
  [2] sqrt
    @ ./math.jl:591 [inlined]
  [3] ci_sison_glaz(x::PowerDivergenceTest, alpha::Float64; skew_correct::Bool)
    @ HypothesisTests ~/.julia/dev/HypothesisTests/src/power_divergence.jl:188
  [4] confint(x::PowerDivergenceTest; level::Float64, tail::Symbol, method::Symbol, correct::Bool, bootstrap_iters::Int64, GC::Bool)
    @ HypothesisTests ~/.julia/dev/HypothesisTests/src/power_divergence.jl:90
  [5] confint
    @ ~/.julia/dev/HypothesisTests/src/power_divergence.jl:70 [inlined]
  [6] show(_io::IOContext{Base.TTY}, test::PowerDivergenceTest)
    @ HypothesisTests ~/.julia/dev/HypothesisTests/src/HypothesisTests.jl:129
  [7] show(io::IOContext{Base.TTY}, #unused#::MIME{Symbol("text/plain")}, x::PowerDivergenceTest)
    @ Base.Multimedia ./multimedia.jl:47
  [8] (::REPL.var"#43#44"{REPL.REPLDisplay{REPL.LineEditREPL}, MIME{Symbol("text/plain")}, Base.RefValue{Any}})(io::Any)
    @ REPL ~/.julia/juliaup/julia-1.8.1+0.x64/share/julia/stdlib/v1.8/REPL/src/REPL.jl:267
  [9] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL ~/.julia/juliaup/julia-1.8.1+0.x64/share/julia/stdlib/v1.8/REPL/src/REPL.jl:521
 [10] display(d::REPL.REPLDisplay, mime::MIME{Symbol("text/plain")}, x::Any)
    @ REPL ~/.julia/juliaup/julia-1.8.1+0.x64/share/julia/stdlib/v1.8/REPL/src/REPL.jl:260
 [11] display(d::REPL.REPLDisplay, x::Any)
    @ REPL ~/.julia/juliaup/julia-1.8.1+0.x64/share/julia/stdlib/v1.8/REPL/src/REPL.jl:272
 [12] display(x::Any)
    @ Base.Multimedia ./multimedia.jl:328
 [13] #invokelatest#2
    @ ./essentials.jl:729 [inlined]
 [14] invokelatest
    @ ./essentials.jl:726 [inlined]
 [15] print_response(errio::IO, response::Any, show_value::Bool, have_color::Bool, specialdisplay::Union{Nothing, AbstractDisplay})
    @ REPL ~/.julia/juliaup/julia-1.8.1+0.x64/share/julia/stdlib/v1.8/REPL/src/REPL.jl:296
 [16] (::REPL.var"#45#46"{REPL.LineEditREPL, Pair{Any, Bool}, Bool, Bool})(io::Any)
    @ REPL ~/.julia/juliaup/julia-1.8.1+0.x64/share/julia/stdlib/v1.8/REPL/src/REPL.jl:278
 [17] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL ~/.julia/juliaup/julia-1.8.1+0.x64/share/julia/stdlib/v1.8/REPL/src/REPL.jl:521
 [18] print_response(repl::REPL.AbstractREPL, response::Any, show_value::Bool, have_color::Bool)
    @ REPL ~/.julia/juliaup/julia-1.8.1+0.x64/share/julia/stdlib/v1.8/REPL/src/REPL.jl:276
 [19] (::REPL.var"#do_respond#66"{Bool, Bool, REPL.var"#77#87"{REPL.LineEditREPL, REPL.REPLHistoryProvider}, REPL.LineEditREPL, REPL.LineEdit.Prompt})(s::REPL.LineEdit.MIState, buf::Any, ok::Bool)
    @ REPL ~/.julia/juliaup/julia-1.8.1+0.x64/share/julia/stdlib/v1.8/REPL/src/REPL.jl:857
 [20] #invokelatest#2
    @ ./essentials.jl:729 [inlined]
 [21] invokelatest
    @ ./essentials.jl:726 [inlined]
 [22] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ REPL.LineEdit ~/.julia/juliaup/julia-1.8.1+0.x64/share/julia/stdlib/v1.8/REPL/src/LineEdit.jl:2510
 [23] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
    @ REPL ~/.julia/juliaup/julia-1.8.1+0.x64/share/julia/stdlib/v1.8/REPL/src/REPL.jl:1248
 [24] (::REPL.var"#49#54"{REPL.LineEditREPL, REPL.REPLBackendRef})()
    @ REPL ./task.jl:484

The issue is with this function

function ci_sison_glaz(x::PowerDivergenceTest, alpha::Float64; skew_correct::Bool=true)
k = length(x.thetahat)
probn = inv(pdf(Poisson(x.n), x.n))
c = 0
p = 0.0
p_old = 0.0
m1, m2, m3, m4, m5 = zeros(k), zeros(k), zeros(k), zeros(k), zeros(k)
mu = zeros(4)
for _c in 1:x.n
#run truncpoi
for i in 1:k
lambda = x.observed[i]
#run moments
a = lambda + _c
b = max(lambda - _c, 0)
poislama = cdf(Poisson(lambda), a)
poislamb = cdf(Poisson(lambda), b - 1)
den = b > 0.0 ? poislama-poislamb : poislama
for r in 1:4
plar = cdf(Poisson(lambda), a - r)
plbr = cdf(Poisson(lambda), b - r - 1)
poisA = ifelse( (a - r) >= 0, poislama - plar, poislama)
poisB = 0.0
if (b - r - 1) >= 0
poisB = poislamb - plbr
end
if (b - r - 1) < 0 && b - 1 >= 0
poisB = poislamb
end
if (b - r - 1) < 0 && (b - 1) < 0
poisB = 0.0
end
mu[r] = lambda^r * (1 - (poisA - poisB) / den)
end
# end of moments
m1[i] = mu[1]
m2[i] = mu[2] + mu[1] - mu[1]^2
m3[i] = mu[3] + mu[2] * (3 - 3mu[1]) + (mu[1] - 3mu[1]^2 + 2mu[1]^3)
m4[i] = mu[4] + mu[3] * (6 - 4mu[1]) + mu[2] * (7 - 12mu[1] + 6mu[1]^2) + mu[1] - 4mu[1]^2 + 6mu[1]^3 - 3mu[1]^4
m5[i] = den
end
for i in 1:k
m4[i] -= 3m2[i]^2
end
s1, s2, s3, s4 = sum(m1), sum(m2), sum(m3), sum(m4)
z = (x.n - s1) / sqrt(s2)

which is based on the SAS script from https://www.jstatsoft.org/index.php/jss/article/view/v005i06/621. I'm not sure what the issue is.

@ParadaCarleton
Copy link
Contributor

ParadaCarleton commented Dec 1, 2022

Having the same problem.

julia> approx_contingency = [
           (2609 - 325) * .27        325 * .16;
           (2609 - 325) * (1-.27)    325 * (1-.16);
       ]
2×2 Matrix{Float64}:
  616.68   52.0
 1667.32  273.0

julia> contingency = round.(Int, approx_contingency)
2×2 Matrix{Int64}:
  617   52
 1667  273

julia> ChisqTest(contingency)
Pearson's Chi-square Test
-------------------------
Population details:
    parameter of interest:   Multinomial Probabilities
    value under h_0:         [0.224478, 0.650953, 0.0319419, 0.0926269]
    point estimate:          [0.236489, 0.638942, 0.019931, 0.104638]
Error showing value of type PowerDivergenceTest:
ERROR: DomainError with -1.4430408161224477:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).

julia> MultinomialLRTest(contingency)
Multinomial Likelihood Ratio Test
---------------------------------
Population details:
    parameter of interest:   Multinomial Probabilities
    value under h_0:         [0.224478, 0.650953, 0.0319419, 0.0926269]
    point estimate:          [0.236489, 0.638942, 0.019931, 0.104638]
Error showing value of type PowerDivergenceTest:
ERROR: DomainError with -1.4430408161224477:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).

@andreasnoack
Copy link
Member Author

If somebody with access to SAS could try out the SAS macro, it would be a useful data point to know if the SAS macro fails in the same way or if our translation has introduced the issue.

@ParadaCarleton
Copy link
Contributor

If somebody with access to SAS could try out the SAS macro, it would be a useful data point to know if the SAS macro fails in the same way or if our translation has introduced the issue.

I don't have access to SAS, but if the specific confidence procedure isn't critical here, I think a Bayesian credible interval with an uninformative prior would be dramatically simpler and work just as well, if not better. The solution is just a conjugate Dirichlet distribution.

@devmotion
Copy link
Member

In general, credible intervals are no confidence intervals and don't satisfy any coverage guarantees.

@ParadaCarleton
Copy link
Contributor

ParadaCarleton commented Dec 3, 2022

In general, credible intervals are no confidence intervals and don't satisfy any coverage guarantees.

Not in general, but they do asymptotically for simple problems like this one. For binomial confidence intervals there's not really any popular interval constructions with exact coverage guarantees; all popular intervals are based on the Central Limit theorem. The uninformative prior credible interval is widely-used here, even by frequentists, because the frequentist coverage of the procedure is good even for very small samples--see e.g. this paper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants