Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use certain internal brms functions in non-linear formulas #1635

Open
dbarneche opened this issue Mar 26, 2024 · 11 comments
Open

Use certain internal brms functions in non-linear formulas #1635

dbarneche opened this issue Mar 26, 2024 · 11 comments
Labels
Milestone

Comments

@dbarneche
Copy link

Hello,

Since the latest CRAN release, I've been having issues adding LOO via add_criterion for a non-linear model.
This has prompted CRAN to ask us for a fix in one of our packages (bayesnec) that depends on brms.
Please see reprex below.

Thanks for your help!

library(brms)
rstan::rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

fct <- function(beta, bot, nec, top, x) {
  bot + (top - bot) * exp(-exp(beta) * (x - nec) * 
    ifelse(x - nec < 0, 0, 1))
}

set.seed(10)
x <- abs(rnorm(100, 1))
y <- fct(beta = 1.2, bot = 0.01, nec = 1.5, top = 0.95, x)
data <- data.frame(x, y)
# plot(y ~ x, data)

# linear
test_l <- brm(
  y ~ x, data = data, iter = 200, warmup = 150, chains = 2, seed = 10
)
add_criterion(test_l, "loo")

nl_mod <- brms::bf(
  y ~ bot + (top - bot) * exp(-exp(beta) * (x - nec) * step(x - nec)),
  bot + top + beta + nec ~ 1, nl = TRUE
)
test_nl <- brm(
  nl_mod, data = data, iter = 200, warmup = 150, chains = 2, seed = 10
)
add_criterion(test_nl, "loo") # returns an error
# Error: Error in x$terms : $ operator is invalid for atomic vectors

Session

> sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 14.2.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

time zone: Australia/Perth
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] brms_2.21.0 Rcpp_1.0.12

loaded via a namespace (and not attached):
 [1] gtable_0.3.4         tensorA_0.36.2.1     ggplot2_3.5.0        QuickJSR_1.1.3       processx_3.8.4       inline_0.3.19        lattice_0.21-8      
 [8] callr_3.7.5          vctrs_0.6.5          tools_4.3.0          ps_1.7.6             generics_0.1.3       stats4_4.3.0         curl_5.0.2          
[15] parallel_4.3.0       sandwich_3.0-2       tibble_3.2.1         fansi_1.0.6          pkgconfig_2.0.3      Matrix_1.6-1         checkmate_2.3.1     
[22] distributional_0.4.0 RcppParallel_5.1.7   lifecycle_1.0.4      compiler_4.3.0       stringr_1.5.1        Brobdingnag_1.2-9    munsell_0.5.0       
[29] codetools_0.2-19     bayesplot_1.11.1     pillar_1.9.0         MASS_7.3-60          StanHeaders_2.32.6   bridgesampling_1.1-2 abind_1.4-5         
[36] multcomp_1.4-25      nlme_3.1-163         posterior_1.5.0      rstan_2.32.6         tidyselect_1.2.1     digest_0.6.35        mvtnorm_1.2-4       
[43] stringi_1.8.3        dplyr_1.1.4          reshape2_1.4.4       splines_4.3.0        grid_4.3.0           colorspace_2.1-0     cli_3.6.2           
[50] magrittr_2.0.3       loo_2.7.0            survival_3.5-7       pkgbuild_1.4.4       utf8_1.2.4           TH.data_1.1-2        scales_1.3.0        
[57] backports_1.4.1      estimability_1.4.1   matrixStats_1.2.0    emmeans_1.8.7        gridExtra_2.3        zoo_1.8-12           coda_0.19-4.1       
[64] V8_4.3.3             rstantools_2.4.0     rlang_1.1.3          xtable_1.8-4         glue_1.7.0           jsonlite_1.8.8       R6_2.5.1            
[71] plyr_1.8.9          
@paul-buerkner
Copy link
Owner

paul-buerkner commented Mar 26, 2024 via email

@dbarneche
Copy link
Author

Thanks very much Paul, that certainly clarified things.
The quick solution for us would be to export a copy of brms:::step's implementation in bayesnec
Before I push that fix to our package though, two quick questions:

1 - was there any reason why you decided not to export brms:::step in the first place?

2 - are there any other functions like step that might cause the same issue with formula evaluation?

@paul-buerkner
Copy link
Owner

1 - because I don't want to clutter the global env with a lot of these brms functions, in particular if they mask other functions as is the case for step. TBH I implemented a lot of the stan functions in brms for precisely the reason you are using them now (in non-linear formulas). However, I have yet to find a safe way to expose just exactly these functions to the formula environment without exposing all of brms internal functions to it which may lead to all sort of hard to debug problems for users.

2 - all internal brms functions one might want to use in non-linear formulas.

If you have a suggestion for how to fix 1 from the brms side, I would be happy to hear!

@paul-buerkner paul-buerkner changed the title Cannot add LOO to non-linear models Use certain internal brms functions in non-linear formulas Mar 26, 2024
@paul-buerkner paul-buerkner added this to the brms 2.22.0 milestone Mar 26, 2024
@dbarneche
Copy link
Author

Thanks Paul, that makes sense. I don't have any solutions for problem 1, unfortunately, but I'll keep my eyes peeled in case I come across any potential ideas. For now, we'll stick with exporting step.

@wds15
Copy link
Contributor

wds15 commented Apr 1, 2024

Could one define inside the brms namespace an environment which has the Stan internal functions and then one places this environment into the search path in some magic way? E.g. could we say that the formula has as parent environment a copy of this Stan functions namespace which itself has as parent the environement of the user which was used to define the brms formula? I am not a guru on these R search path stuff, but maybe this is a solution to follow on?

@paul-buerkner
Copy link
Owner

Yeah, I like the idea. I think this is doable.

@venpopov
Copy link
Contributor

venpopov commented Apr 1, 2024

I was going to suggest something very similar. A quick test with just the step function as an example show it works:

env <- new.env()
env$step <- brms:::step

environment(test_nl$formula$formula) <- new.env(parent = env)
add_criterion(test_nl, "loo")

Should be pretty safe, since this internal stan functions environment won't be exposed to any other functions, just to things that work with the formula

@paul-buerkner
Copy link
Owner

for new formula envs it is safe. for existing formula envs like global-env it may not be super safe without making sure that the env changes are reverted after evaluation.

@wds15
Copy link
Contributor

wds15 commented Apr 2, 2024

But how about this pattern:

cur_env <- environment(test_nl$formula$formula) ## not sure if this copies...
env <- new.env(parent = cur=env)
env$step <- brms:::step

environment(test_nl$formula$formula) <- new.env(parent = env)
add_criterion(test_nl, "loo")

Essentially one should be able to "sandwich" a pre-populated Stan functions environment into the child-parent structure of environments.

@paul-buerkner
Copy link
Owner

yeah, I think this could work. Are there any dangers of side effects here?

@wds15
Copy link
Contributor

wds15 commented Apr 2, 2024

I'd hope not, but I am not confident. I read through the R documentation of environments and how they are searched - but I have not messed myself with environments at this level.

A ramification is obviously that the functions in the Stan internal functions environment get precedence over what is found in the environment of the formula - but that would be intended, I guess. It get's a bit tricky with the vectorisation if the functions, I guess (so the Stan internal functions as implemented in R need to be clever in this regard).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants