Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize lower and upper bounds separately #367

Open
bwiernik opened this issue Aug 17, 2021 · 7 comments
Open

Optimize lower and upper bounds separately #367

bwiernik opened this issue Aug 17, 2021 · 7 comments
Assignees
Labels
enhancement 🔥 New feature or request WIP 👷‍♂️ work in progress

Comments

@bwiernik
Copy link
Contributor

I am thinking about this bit from the documentation:

#' @section CI Does Not Contain the Estimate:
#' For very large sample sizes, the width of the CI can be smaller than the
#' tolerance of the optimizer, resulting in CIs of width 0. This can also,
#' result in the estimated CIs excluding the point estimate. For example:
#' ```{r}
#' t_to_d(80, df_error = 4555555)
#' ```

This surprised me, because MBESS doesn't have an issue with it (and neither does Steiger's NDC program for similar situations: https://www.statpower.net/Software.html):

MBESS::conf.limits.nct(80, 4555555)
#> [1] "The observed noncentrality parameter of the noncentral t-distribution has exceeded 37.62 in magnitude (R's limitation for accurate probabilities from the noncentral t-distribution) in the function's iterative search for the appropriate value(s). The results may be fine, but they might be inaccurate; use caution."
#> $Lower.Limit
#> [1] 78.03934
#> 
#> $Prob.Less.Lower
#> [1] 0.025
#> $Upper.Limit
#> [1] 81.96065
#> 
#> $Prob.Greater.Upper
#> [1] 0.025

Created on 2021-08-17 by the reprex package (v2.0.0)

I think the issue for us is the simultaneous optimization of the lower and upper limits. I think that might be lowering the effective tolerance for the two values individually. Exploring the various methods in optim(), most fail to yield a sensible solution, but "SANN" gets the correct lower bound, but an incorrect upper bound. "Brent" (unidimensional) gets both limits correct.

I am wondering whether we should switch to optimizing the two limits separately using either optimize() or nlm(). MBESS::conf.limits.nct() uses both and retains the best result:

.conf.limits.nct.M1 <- function(ncp, df, conf.level = NULL, 
    alpha.lower, alpha.upper, tol = 1e-09, sup.int.warns = TRUE, 
    ...) {
    if (sup.int.warns == TRUE) 
      Orig.warn <- options()$warn
    options(warn = -1)
    min.ncp = min(-150, -5 * ncp)
    max.ncp = max(150, 5 * ncp)
    .ci.nct.lower <- function(val.of.interest, ...) {
      (qt(p = alpha.lower, df = df, ncp = val.of.interest, 
        lower.tail = FALSE, log.p = FALSE) - ncp)^2
    }
    .ci.nct.upper <- function(val.of.interest, ...) {
      (qt(p = alpha.upper, df = df, ncp = val.of.interest, 
        lower.tail = TRUE, log.p = FALSE) - ncp)^2
    }
    if (alpha.lower != 0) {
      if (sup.int.warns == TRUE) 
        Low.Lim <- suppressWarnings(optimize(f = .ci.nct.lower, 
          interval = c(min.ncp, max.ncp), alpha.lower = alpha.lower, 
          df = df, ncp = ncp, maximize = FALSE, tol = tol))
      if (sup.int.warns == FALSE) 
        Low.Lim <- optimize(f = .ci.nct.lower, interval = c(min.ncp, 
          max.ncp), alpha.lower = alpha.lower, df = df, 
          ncp = ncp, maximize = FALSE, tol = tol)
    }
    if (alpha.upper != 0) {
      if (sup.int.warns == TRUE) 
        Up.Lim <- suppressWarnings(optimize(f = .ci.nct.upper, 
          interval = c(min.ncp, max.ncp), alpha.upper = alpha.upper, 
          df = df, ncp = ncp, maximize = FALSE, tol = tol))
      if (sup.int.warns == FALSE) 
        Up.Lim <- optimize(f = .ci.nct.upper, interval = c(min.ncp, 
          max.ncp), alpha.upper = alpha.upper, df = df, 
          ncp = ncp, maximize = FALSE, tol = tol)
    }
    if (alpha.lower == 0) 
      Result <- list(Lower.Limit = -Inf, Prob.Less.Lower = 0, 
        Upper.Limit = Up.Lim$minimum, Prob.Greater.Upper = pt(q = ncp, 
          ncp = Up.Lim$minimum, df = df))
    if (alpha.upper == 0) 
      Result <- list(Lower.Limit = Low.Lim$minimum, Prob.Less.Lower = pt(q = ncp, 
        ncp = Low.Lim$minimum, df = df, lower.tail = FALSE), 
        Upper.Limit = Inf, Prob.Greater.Upper = 0)
    if (alpha.lower != 0 & alpha.upper != 0) 
      Result <- list(Lower.Limit = Low.Lim$minimum, Prob.Less.Lower = pt(q = ncp, 
        ncp = Low.Lim$minimum, df = df, lower.tail = FALSE), 
        Upper.Limit = Up.Lim$minimum, Prob.Greater.Upper = pt(q = ncp, 
          ncp = Up.Lim$minimum, df = df))
    if (sup.int.warns == TRUE) 
      options(warn = Orig.warn)
    return(Result)
  }
  .conf.limits.nct.M2 <- function(ncp, df, conf.level = NULL, 
    alpha.lower, alpha.upper, tol = 1e-09, sup.int.warns = TRUE, 
    ...) {
    .ci.nct.lower <- function(val.of.interest, ...) {
      (qt(p = alpha.lower, df = df, ncp = val.of.interest, 
        lower.tail = FALSE, log.p = FALSE) - ncp)^2
    }
    .ci.nct.upper <- function(val.of.interest, ...) {
      (qt(p = alpha.upper, df = df, ncp = val.of.interest, 
        lower.tail = TRUE, log.p = FALSE) - ncp)^2
    }
    if (sup.int.warns == TRUE) {
      Low.Lim <- suppressWarnings(nlm(f = .ci.nct.lower, 
        p = ncp, ...))
      Up.Lim <- suppressWarnings(nlm(f = .ci.nct.upper, 
        p = ncp, ...))
    }
    if (sup.int.warns == FALSE) {
      Low.Lim <- nlm(f = .ci.nct.lower, p = ncp, ...)
      Up.Lim <- nlm(f = .ci.nct.upper, p = ncp, ...)
    }
    if (alpha.lower == 0) 
      Result <- list(Lower.Limit = -Inf, Prob.Less.Lower = 0, 
        Upper.Limit = Up.Lim$estimate, Prob.Greater.Upper = pt(q = ncp, 
          ncp = Up.Lim$estimate, df = df))
    if (alpha.upper == 0) 
      Result <- list(Lower.Limit = Low.Lim$estimate, Prob.Less.Lower = pt(q = ncp, 
        ncp = Low.Lim$estimate, df = df, lower.tail = FALSE), 
        Upper.Limit = Inf, Prob.Greater.Upper = 0)
    if (alpha.lower != 0 & alpha.upper != 0) 
      Result <- list(Lower.Limit = Low.Lim$estimate, Prob.Less.Lower = pt(q = ncp, 
        ncp = Low.Lim$estimate, df = df, lower.tail = FALSE), 
        Upper.Limit = Up.Lim$estimate, Prob.Greater.Upper = pt(q = ncp, 
          ncp = Up.Lim$estimate, df = df))
    return(Result)
  }
@mattansb
Copy link
Member

(Wow, that is some monster code (to maintain) right there...)

I am all for what ever we can do to make the CIs more accurate.

I suggest waiting for #366 to be merged, and then you can make what ever changes you want to utils_ncp_ci.R.
Specifically, it would be best to update those functions (for F, t, and Chi) as is (i.e. keep the function names and argument names / order) for minimal hassle.

Sound good?

@mattansb mattansb assigned bwiernik and mattansb and unassigned mattansb Aug 18, 2021
@mattansb mattansb added enhancement 🔥 New feature or request WIP 👷‍♂️ work in progress labels Aug 18, 2021
@bwiernik
Copy link
Contributor Author

Yeah that's what I was thinking

@mattansb
Copy link
Member

Alright, we're ready for you now Dr. Wiernik.

@bwiernik
Copy link
Contributor Author

bwiernik commented Aug 19, 2021

On my way

image

@mattansb
Copy link
Member

I'm planning on submitting to CRAN when I get back (Sep 1st). Do you think this will be ready by then?
Or for the next cycle?

@mattansb
Copy link
Member

I'm BACK!

@bwiernik , what's up doc?

@bwiernik
Copy link
Contributor Author

Should be able to do this this week I think. Need to catch up on teaching a bit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement 🔥 New feature or request WIP 👷‍♂️ work in progress
Projects
None yet
Development

No branches or pull requests

2 participants