
Add argument boot = FALSE? #11

Open
rempsyc opened this issue Dec 11, 2022 · 17 comments · Fixed by #15

Comments

@rempsyc

rempsyc commented Dec 11, 2022

I know that this package is all about bootstrapping. I use your package inside my own package, rempsyc, when calculating effect sizes for contrasts. By default, it thus uses bootstrapping, but I would like to give my users the ability to not bootstrap, should they wish to do so.

Now, I could rely on a completely different package to accomplish this, but it would be simpler on my end to stick to the same package and only change a single parameter, especially since, to my knowledge, few effect size packages explicitly support contrasts the way you do.

Do you think there is any chance you might make this change? Alternatively, is there already a way to request no bootstraps through the ... argument passed to boot?

@dgerlanc
Owner

Thanks for opening an issue!

You can request no bootstrapping by passing R=0; R is the number of bootstrap replicates (this matches the original boot package).

The non-resampled statistic is then available through the t0 element of the result.
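
For example, a minimal sketch of pulling t0 out of a regular call:

library(bootES)

res <- bootES(1:10, R = 1000)
res$t0  # the plug-in estimate computed on the original, non-resampled data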

@rempsyc
Author

rempsyc commented Dec 12, 2022

Oh fantastic! I didn't know about this! Am I doing it right though? I'm getting the error below:

library(bootES)
#> Loading required package: boot
bootES(1:10, R=0)
#> Error in bootES(1:10, R = 0): R must be an integer of length 1 and greater than 0

Created on 2022-12-11 with reprex v2.0.2

@rempsyc
Author

rempsyc commented Dec 12, 2022

I think it comes from lines 20-22 of the bootES function:

r.is.valid = (length(R) == 1) && is.numeric(R) && R > 0
if (!r.is.valid)
  stop("R must be an integer of length 1 and greater than 0")

@dgerlanc
Owner

I am getting the same error; we added that check to prevent users from passing an R less than 0.

I tested boot with R=0 and it appears to work:

library(boot)

boot(1:100, function(x, i) mean(x[i]), R=0)
#> 
#> ORDINARY NONPARAMETRIC BOOTSTRAP
#> 
#> 
#> Call:
#> boot(data = 1:100, statistic = function(x, i) mean(x[i]), R = 0)
#> 
#> 
#> Bootstrap Statistics :

Created on 2022-12-12 with reprex v2.0.2.9000

I believe I should be able to update this check to allow R >= 0. Will that work for you?
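
Something along these lines, perhaps (a sketch of the relaxed check, not the final code):

r.is.valid = (length(R) == 1) && is.numeric(R) && R >= 0
if (!r.is.valid)
  stop("R must be an integer of length 1 and greater than or equal to 0")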

@rempsyc
Author

rempsyc commented Dec 13, 2022

Yes, that would work well, thank you so much!

By the way, is there any argument to handle missing values, like na.rm = TRUE? That's another thing I am currently struggling with in rempsyc: I currently use na.omit on the whole data set, but for large data sets with many variables that excludes almost all of the rows.

library(bootES)
#> Loading required package: boot
bootES(c(1:10, NA), R=1000)
#> Error in bca.ci(boot.out, conf, index[1L], L = L, t = t.o, t0 = t0.o, : estimated adjustment 'w' is infinite

Created on 2022-12-12 with reprex v2.0.2

@dgerlanc
Owner

Currently, there is no argument to handle missing data.

I'm not sure this can be handled automatically in bootES unless we define bootstrapping methods that incorporate specific NA handling, with the associated assumptions and statistical adjustments. Otherwise, I imagine that silently excluding NAs in the statistic-calculation functions would introduce bias.

I think the safest approach (and what we do now), at least until we think of a principled way to do this in bootES, is to fail when there is missing data, which forces the user to deal with it upstream.
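
For a single vector, that upstream handling can be as simple as this minimal sketch:

x <- c(1:10, NA)
bootES(x[!is.na(x)], R = 1000)  # drop NAs explicitly before bootstrapping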

@rempsyc
Author

rempsyc commented Dec 13, 2022

I see. The new approach I am using to handle this in my nice_contrasts convenience function is:

es.lists <- lapply(groups.contrasts, function(y) {
    lapply(response.list, function(x) {
      bootES::bootES(
        data = stats::na.omit(data[, c(group, x)]),
        R = bootstraps,
        data.col = x,
        group.col = group,
        contrast = y,
        effect.type = effect.type,
        ...
      )
    })
  })

In this way, instead of feeding the whole data set (where na.omit would drop almost every row when many variables have missing data), it feeds a new data set with only the two relevant columns: the group (for contrasts) and the dependent variable of the current loop, thus maximizing the retained data for each DV.

@dgerlanc
Owner

dgerlanc commented Dec 13, 2022

> In this way, instead of feeding the whole data set (where na.omit would drop almost every row when many variables have missing data), it feeds a new data set with only the two relevant columns: the group (for contrasts) and the dependent variable of the current loop, thus maximizing the retained data for each DV.

Seems like the correct approach!

======

A thought: take the subset before calling the inner lapply to avoid running na.omit multiple times:

es.lists <- lapply(groups.contrasts, function(y) {
    data_column_subset = stats::na.omit(data[, c(group, x)]) 
    lapply(response.list, function(x) {
      bootES::bootES(
        data = data_column_subset,
        R = bootstraps,
        data.col = x,
        group.col = group,
        contrast = y,
        effect.type = effect.type,
        ...
      )
    })
  })

@rempsyc
Author

rempsyc commented Dec 13, 2022

Thanks for the suggestion! I believe that would not work, however, because x is the loop variable of the inner lapply over response.list, which is a list of dependent variables (so I do need to run na.omit as many times as there are DVs). That allows me to do something like this:

library(rempsyc)
x <- nice_contrasts(data = mtcars, response = c("mpg", "disp"), group = "cyl")
x
#>   Dependent Variable Comparison df          t            p         d   CI_lower
#> 1                mpg      4 - 8 29   8.904534 8.568209e-10  3.587739  2.7021636
#> 2                mpg      6 - 8 29   3.111825 4.152209e-03  1.440495  0.8006059
#> 3                mpg      4 - 6 29   4.441099 1.194696e-04  2.147244  1.3594089
#> 4               disp      4 - 8 29 -11.920787 1.064054e-12 -4.803022 -5.7803822
#> 5               disp      6 - 8 29  -7.104461 8.117219e-08 -3.288726 -4.3449495
#> 6               disp      4 - 6 29  -3.131986 3.945539e-03 -1.514296 -2.2711031
#>     CI_upper
#> 1  4.5238179
#> 2  1.9758494
#> 3  3.0538567
#> 4 -3.8404060
#> 5 -2.3207227
#> 6 -0.8926851

nice_table(x)
[screenshot of the resulting formatted table]

Created on 2022-12-13 with reprex v2.0.2
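
That said, since the na.omit subset depends on the DV (x) but not on the contrast (y), it could still be hoisted out of the contrast loop. A hypothetical variant of the snippet above:

subsets <- lapply(response.list, function(x) {
  stats::na.omit(data[, c(group, x)])
})
es.lists <- lapply(groups.contrasts, function(y) {
  lapply(seq_along(response.list), function(i) {
    bootES::bootES(
      data = subsets[[i]],
      R = bootstraps,
      data.col = response.list[[i]],
      group.col = group,
      contrast = y,
      effect.type = effect.type,
      ...
    )
  })
})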

@dgerlanc
Owner

Good point, would likely not help out in that case. The output looks great!

@rempsyc
Author

rempsyc commented Mar 14, 2023

> I believe I should be able to update this check to allow R >= 0. Will that work for you?

I can submit a PR for this; would that be helpful? :)

@dgerlanc
Owner

I'll get the PR in shortly!

dgerlanc added a commit that referenced this issue Apr 9, 2023
dgerlanc added a commit that referenced this issue Apr 9, 2023
@rempsyc
Author

rempsyc commented Apr 11, 2023

library(bootES)
#> Loading required package: boot
packageVersion("bootES")
#> [1] '1.3.0'
bootES(1:10, R=0)
#> [1] "All values of t are equal to  NaN \n Cannot calculate confidence intervals"
#> Error in matrix(c(t0, x[["bounds"]], bias, std.error), nrow = 1, dimnames = list(NULL, : length of 'dimnames' [2] not equal to array extent

Created on 2023-04-11 with reprex v2.0.2

We should probably add a test for that.

@dgerlanc
Owner

We should add a test for this!

It is a bit tricky because you will get the same error if you use R=1.

Also, the error is thrown when the object is printed because that is when the confidence intervals are calculated:

"res = bootES(1:10, R=0)"
#> [1] "All values of t are equal to  NaN \n Cannot calculate confidence intervals"

Created on 2023-04-11 with reprex v2.0.2

I will investigate where exactly CIs are being calculated outside of the call to print.bootES; I know part of the calculation happens on printing. Generally, we want to throw a warning/error if CIs can't be calculated, as the goal of the package is to calculate CIs.

I'm OK with not calculating CIs in the R=0 case: a user who passes R=0 most likely does not want CIs, or they would not have passed R=0.
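
A hypothetical testthat sketch of that intent (the expectations are placeholders, not final):

library(testthat)
library(bootES)

test_that("R = 0 returns the plug-in statistic and prints cleanly", {
  res <- bootES(1:10, R = 0)
  expect_length(res$t0, 1)   # t0 should still be available
  expect_output(print(res))  # printing should not error even without CIs
})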

@rempsyc
Author

rempsyc commented Apr 11, 2023

My goal is to allow my users to calculate and report confidence intervals, even when they do not wish to use bootstrapping (re: the table screenshot example I presented a few posts above). So I think that would actually be important 😬

If the CI calculation relies on bootstrapped values, one approach could be to detect how many replicates were requested and, if < 2, use a classical approach to calculating the CI on Cohen's d.
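
For instance, the usual large-sample normal approximation for two independent groups (a rough sketch, not bootES code):

d_classic_ci <- function(d, n1, n2, conf = 0.95) {
  se <- sqrt((n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2)))
  z <- qnorm(1 - (1 - conf) / 2)
  c(lower = d - z * se, upper = d + z * se)
}
d_classic_ci(0.5, n1 = 30, n2 = 30)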

dgerlanc reopened this Apr 14, 2023
@dgerlanc
Owner

So how would you like this to work?

We don't really want to encourage folks to use classical approaches to CIs. If the number of simulations is < 2, I'd argue you should just run more simulations!

@rempsyc
Author

rempsyc commented Apr 14, 2023

> So how would you like this to work?

The way I see it, quite simply: if R = 0, compute the classic CI; otherwise, compute the bootstrapped CI. You say the bootstrapped CI won't work for R = 1? You could add an intentional error there if you wanted.
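
In pseudocode (classic_ci is a hypothetical helper, not an existing bootES function):

if (R == 0) {
  ci <- classic_ci(t0, conf)  # hypothetical classical-CI helper
} else if (R == 1) {
  stop("at least 2 bootstrap replicates are needed for a bootstrap CI")
} else {
  ci <- boot::boot.ci(boot.out, conf = conf, type = ci.type)
}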

> We don't really want to encourage folks to use classical approaches to CIs. If the number of simulations is < 2, I'd argue you should just run more simulations!

Well, if you don't want to do it on principle, I respect that. On my end, I have to provide what users request. Let me tell you a story: the statistical mentor who first introduced me to bootstrapped CIs recently told me that he has stopped using them, because when he conducted his meta-analysis it was a nightmare to recover the original p values and so on from the bootstrapped confidence intervals, since they did not match. So yes, some people want non-bootstrapped CIs.

But as package maintainers, I think we should offer the choice, while noting in the documentation that this is not recommended and providing a reference.

Thoughts?
