added "one-sided" alternative for proportion_confint #9249
Conversation
assert_equal(ci_upp, 1.0)

ci_low, ci_upp = proportion_confint(100, 200)
assert_almost_equal(ci_low, 0.4307048087825161)
Where do the comparison numbers come from?
We now mainly use assert_allclose with atol and/or rtol (the usage of assert_almost_equal mostly predates numpy's assert_allclose).
These numbers come from the latest version of statsmodels.stats.proportion_confint. Because I did not alter the calculation logic, those numbers should not change.
Should I change all assert_almost_equal to assert_allclose?
Usually we add comparison numbers from other packages, mainly from R for stats. A # comment should mention the source, e.g. "regression test" if the numbers are from an earlier statsmodels version.
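For example, a test in the currently preferred style could look like the following sketch (the comparison number is the one quoted above, treated as a regression value):

```python
from numpy.testing import assert_allclose
from statsmodels.stats.proportion import proportion_confint

# regression test: comparison number taken from an earlier statsmodels version
ci_low, ci_upp = proportion_confint(100, 200)
assert_allclose(ci_low, 0.4307048087825161, rtol=1e-10)
```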
About binomtest confidence intervals: scipy.stats now also has confint and alternative options (which it did not have when we added those in statsmodels). AFAICS, the one-sided binom test confints are the same as the beta (although scipy does rootfinding also in the one-sided case). Notation/definition of the boundary points …
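A quick numerical check of that claim could look like the following sketch; it uses scipy's BinomTestResult.proportion_ci and compares the one-sided exact bound against the beta quantile written out directly (the specific numbers 100 out of 200 are just the example from the tests above):

```python
from scipy import stats

count, nobs, alpha = 100, 200, 0.05

# one-sided upper bound from scipy's exact confidence interval
res = stats.binomtest(count, nobs, alternative='less')
upper_scipy = res.proportion_ci(confidence_level=1 - alpha, method='exact').high

# the same bound written directly in terms of the beta distribution
upper_beta = stats.beta.isf(alpha, count + 1, nobs - count)

# the two should agree up to scipy's root-finding tolerance
print(upper_scipy, upper_beta)
```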
Thanks for the PR.
Thank you so much for all the comments.
Great. And please add the one-sided intervals to the binom_test method. jeffreys looks fine to me; it could be replaced by isf and ppf in the two one-sided cases, but your current code using …
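For jeffreys, a sketch of what the isf/ppf version of the one-sided bounds could look like (the Beta(count + 0.5, nobs - count + 0.5) posterior is the standard Jeffreys form; the actual statsmodels code may differ in details such as clipping at 0 and 1):

```python
from scipy import stats

count, nobs, alpha = 100, 200, 0.05
a, b = count + 0.5, nobs - count + 0.5  # Jeffreys posterior Beta parameters

# two-sided interval: alpha / 2 tail mass on each side
ci_low, ci_upp = stats.beta.interval(1 - alpha, a, b)

# one-sided bounds put the full alpha into a single tail
lower_bound = stats.beta.ppf(alpha, a, b)   # one-sided lower bound (upper end is 1)
upper_bound = stats.beta.isf(alpha, a, b)   # one-sided upper bound (lower end is 0)
```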
It seems like the outputs from the current version of statsmodels and R 4.3.3 do not match.
statsmodels: proportion_confint(100, 200, method='binom_test') -> (0.42988371451156804, 0.5701162853923397)
Do you have any ideas?
The problem is base R. There is an article that points this out, and an R package that provides a matching confidence interval.
For method binom_test you need a unit test separate from the other methods, because you cannot check that the one-sided interval can be obtained by doubling alpha.
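For the equal-tail methods the doubling-alpha relation does hold, so a unit test along these lines is possible there but not for binom_test. This sketch assumes the new alternative keyword from this PR and that 'smaller' yields the interval bounded above, which is my reading of the convention discussed further down:

```python
from numpy.testing import assert_allclose
from statsmodels.stats.proportion import proportion_confint

count, nobs, alpha = 100, 200, 0.05

# for an equal-tail method such as 'beta', the one-sided bound at level alpha
# should equal the corresponding two-sided bound at level 2 * alpha
two_sided = proportion_confint(count, nobs, alpha=2 * alpha, method="beta")
one_sided = proportion_confint(count, nobs, alpha=alpha, method="beta",
                               alternative="smaller")
assert_allclose(one_sided[1], two_sided[1], rtol=1e-12)
```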
Thank you, Josef. Understood: scipy.stats.binomtest is not based on a distribution with equal tails.
minlike and central/equal-tail are based on the same distribution; the minlike p-value is based on the likelihood ordering of outcomes. In symmetric distributions like normal and t, both methods are the same, but not in the asymmetric distributions that we have for proportions and rates (Binomial and Poisson). (Coverage in approximate methods can still be slightly off-center because of the approximation; actual tail probabilities are only approximately alpha/2.)
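A small illustration of the minlike vs central distinction, using scipy's binomtest (the counts 3 out of 20 against p0 = 0.3 are arbitrary example values):

```python
from scipy import stats

count, nobs, p0 = 3, 20, 0.3

# minlike two-sided p-value: scipy's default sums the probabilities of all
# outcomes that are no more likely than the observed one
p_minlike = stats.binomtest(count, nobs, p=p0).pvalue

# central / equal-tail two-sided p-value: twice the smaller one-sided tail
p_less = stats.binomtest(count, nobs, p=p0, alternative='less').pvalue
p_greater = stats.binomtest(count, nobs, p=p0, alternative='greater').pvalue
p_central = min(1.0, 2 * min(p_less, p_greater))

# the Binomial is asymmetric here, so the two definitions give different values
print(p_minlike, p_central)
```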
I committed new code.
Sorry, I forgot to remove some redundant parts. Please review the latest commit.
statsmodels/stats/proportion.py (outdated)
if alternative == 'two-sided':
    print(qi, stats.binomtest(count, nobs, p=qi).pvalue - alpha)
    return stats.binomtest(count, nobs, p=qi).pvalue - alpha
elif alternative == 'smaller':
AFAICS, these will never be used because of the if above, so we don't need them.
I removed it from the latest commit.
if alternative == 'two-sided':
    if method != "binom_test":
Why not for binom_test?
Because scipy.stats.binomtest uses "two-sided" as the default for alternative, the original alpha should be used instead of alpha / 2.0.
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binomtest.html
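For context, a rough sketch of how the two-sided binom_test interval can be inverted numerically; the helper name and bracketing endpoints are illustrative, not the actual statsmodels implementation, and the edge cases count == 0 or count == nobs are ignored:

```python
from scipy import stats, optimize

def binom_test_ci(count, nobs, alpha=0.05):
    # find the values of p where the two-sided binomtest p-value crosses alpha;
    # the full alpha is used because the inverted test is already two-sided
    def func(qi):
        return stats.binomtest(count, nobs, p=qi).pvalue - alpha

    point = count / nobs
    ci_low = optimize.brentq(func, 1e-8, point)
    ci_upp = optimize.brentq(func, point, 1 - 1e-8)
    return ci_low, ci_upp
```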
Looks good overall.
Thank you so much for your time.
However, regarding your example: I will have to think about this again later; I also always get confused. It is something like: at the upper limit of the left (lower) interval we reject the null hypothesis in favor of the alternative hypothesis that the proportion is larger than that value, with a p-value equal to alpha (in the continuous case; a weak inequality in the discrete case).
AFAICS, my intuition was correct: in our definition in statsmodels (my definition), the one-sided alternative means that the rejection region is in the direction of the alternative.
I see that. So the word 'alternative' corresponds to the rejection region.
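As a numerical illustration of that reading, using the exact (Clopper-Pearson / beta) form of the one-sided lower bound as an example:

```python
from scipy import stats

count, nobs, alpha = 100, 200, 0.05

# exact (Clopper-Pearson style) lower bound of the one-sided interval (ci_low, 1]
ci_low = stats.beta.ppf(alpha, count, nobs - count + 1)

# at that boundary value, the test against the alternative "proportion is larger"
# rejects with p-value equal to alpha (up to floating point), by construction
pval = stats.binomtest(count, nobs, p=ci_low, alternative='greater').pvalue
print(ci_low, pval)
```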
Can you squash this? The earlier commits are not relevant. The final version looks good to me and can be merged. Thanks for the PR.
Force-pushed from a666bda to c828cab (commits: switched "larger" and "smaller"; updated binom_test; format styles; added "one-sided" alternative for proportion_confint).
I've always been helped by statsmodels in my research.
Unrelated unit test failure; merging.
If you are interested, you could add alternative in the same way to the Poisson rates confidence intervals. Those are all …
Let me take that on.