Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow bounds on fixed effect priors #28

Open
nicholasjclark opened this issue Sep 12, 2023 · 5 comments
Open

Allow bounds on fixed effect priors #28

nicholasjclark opened this issue Sep 12, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@nicholasjclark
Copy link
Owner

Incorporating bounds on fixed effect priors can be done using the truncation syntax i.e.

b[1] ~ normal(0, 1)T[0, ]

This would require checks on inits to ensure they respect the correct bounding, but would be a useful feature to allow more domain expertise to be included in the prior

@A108669
Copy link

A108669 commented Feb 13, 2024

@nicholasjclark - I'm curious that in the meantime while this feature is considered/developed, are there any "hacks" in mvgam akin to this one in brms by which uses can enforce bounds on parameters? I believe the answer is no, but I figured I would not assume and ask 😄

I was checking out the forecasting course you have on your github, and I did see that mvgam does accept brms priors which have implemented up and lb parameters; however, brms has an open issue to allow priors on individual parameters. The above workaround requires a non-linear model functionality in brms which from my brief exploration of the source code is not leveraged.

@nicholasjclark
Copy link
Owner Author

Hi @A108669, unfortunately this can't be done currently with the mvgam interface. However it wouldn't be too difficult to do so by modifying the returned model code and then fitting the model. I don't go into details of how to do this in the vignettes, but it is something I regularly do when designing new features. Basically you need to construct your model but use run_model = FALSE. The returned object is of class mvgam_prefit and will contain all the data objects needed to fit the model, as well as the Stan code. You can modify the model and then fit it to the data using the regular cmdstanr functionality. This is all straightforward, but then you just need a few additional steps to gather the samples in the correct way, add residuals and update the structure of the object to class mvgam. If you are really interested in this then please let me know and I can produce a simple example to follow

@A108669
Copy link

A108669 commented Feb 14, 2024

Hi Matt Graziano, unfortunately this can't be done currently with the mvgam interface. However it wouldn't be too difficult to do so by modifying the returned model code and then fitting the model. I don't go into details of how to do this in the vignettes, but it is something I regularly do when designing new features. Basically you need to construct your model but use run_model = FALSE. The returned object is of class mvgam_prefit and will contain all the data objects needed to fit the model, as well as the Stan code. You can modify the model and then fit it to the data using the regular cmdstanr functionality. This is all straightforward, but then you just need a few additional steps to gather the samples in the correct way, add residuals and update the structure of the object to class mvgam. If you are really interested in this then please let me know and I can produce a simple example to follow

Thank you, @nicholasjclark, for all the help. If you are open to nudging me the right way, I would greatly appreciate it. In a dummy example (code below), I was able to, per your instructions, write the mvgam stan code to a .stan file, edit it to include bounds on a parameter, and fit the model with cmdstanr.

a few additional steps to gather the samples in the correct way, add residuals and update the structure of the object to class mvgam

Any help on this piece would be appreciated. I'm assuming I need to somehow inject posterior_samples <- fit$draws() into the mvgam object?


I found code here that shows how to pull the stan code from mvgam and execute it with cmdstanr. I'm having trouble piecing together how to inject the samples and residuals back into the mvgam object. I believe some of the required manipulation is here, but I am still working through understanding the source code. Am I looking in the right place?

library(mvgam)
library(cmdstanr)
set.seed(1000)
simdat <- sim_mvgam(T = 100, 
                    n_series = 1, 
                    trend_model = 'GP',
                    prop_trend = 0.75,
                    family = poisson(),
                    prop_missing = 0.10)

form = formula(y ~ 1 + time)
mod1 <- mvgam(form,
              trend_model = 'None',
              data = simdat$data_train,
              backend="cmdstanr",
              use_stan = TRUE,
              run_model = FALSE)

model_data <- mod1$model_data
stan_file_path <- write_stan_file(mod1$model_file)

#** Edit stan_file_path in editor **#
# Add <upper/lower> to parameters
# 
# parameters {
#   // raw basis coefficients
#   vector [num_basis] b_raw;
# }
# 
# parameters {
#   // raw basis coefficients
#   vector<lower=0> [num_basis] b_raw;
# }

# Also can bound via the prior such as  b_raw[2] ~ student_t(3, 0, 2) T[,0];

cmd_mod <- cmdstan_model(stan_file_path,
                         stanc_options = list('canonicalize=deprecations,braces,parentheses'))
cmd_mod$print()

fit <- cmd_mod$sample(data = model_data,
                     chains = 4,
                     refresh = 100)

fit$summary()

posterior_samples <- fit$draws(format="draws_matrix")

@nicholasjclark
Copy link
Owner Author

nicholasjclark commented Feb 19, 2024

Hi @A108669, sorry about the slow reply. Good to see that you've worked out all the modification and sampling steps. As you have probably notices, mvgam uses a rather clunky setup for the betas by sampling the beta_raw parameters and then creating the b parameters in the transformed_parameters block. I did this so that I could use manipulations such as drawing non-centred hierarchical parameters, but it also allows for different kinds of constraints to be added in the future. In that example you sent, I think forcing b[2] = exp(b_raw[2]) would be the most efficient way to do this. But note that I haven't declared the lower=0 constraint on b, as this would involve setting variable constraints because we only want one of the b's to be constrained and I'm not sure of the best way to do that. This could mean that initial values chosen by the sampler will give problems, but I tend to find that isn't an issue if you are using sampling (it may be more of an issue for variational inference though).

Once you've done the sampling, there are really just a few steps that use internal functions to get the object in the right structure. Here is your example completed with these steps:

library(mvgam)
library(cmdstanr)
set.seed(1000)
simdat <- sim_mvgam(T = 100, 
                    n_series = 1, 
                    trend_model = 'GP',
                    prop_trend = 0.75,
                    family = poisson(),
                    prop_missing = 0.10)

form = formula(y ~ 1 + time)
mod1 <- mvgam(form,
              trend_model = 'None',
              data = simdat$data_train,
              backend="cmdstanr",
              use_stan = TRUE,
              run_model = FALSE)

model_data <- mod1$model_data
stan_file_path <- write_stan_file(mod1$model_file)

#** Edit stan_file_path in editor **#
# transformed parameters {
#   // basis coefficients
#   vector[num_basis] b;
#   b[1 : num_basis] = b_raw[1 : num_basis];
#   
#   // constrain time coeffient to be positive
#   b[2] = exp(b_raw[2]);
# }

cmd_mod <- cmdstan_model(stan_file_path,
                         stanc_options = list('canonicalize=deprecations,braces,parentheses'))
cmd_mod$print()

fit <- cmd_mod$sample(data = model_data,
                      chains = 4,
                      refresh = 100)


# fit is a cmdstanr object, so we need to convert to stanfit
out_gam_mod <- mvgam:::read_csv_as_stanfit(fit$output_files())
out_gam_mod <- mvgam:::repair_stanfit(out_gam_mod)
algorithm <- 'sampling'
if(algorithm %in% c('meanfield', 'fullrank',
                    'pathfinder', 'laplace')){
  out_gam_mod@sim$iter <- samples
  out_gam_mod@sim$thin <- 1
  out_gam_mod@stan_args[[1]]$method <- 'sampling'
}

# start building the mvgam object
mod2 <- mod1
mod2$model_output <- out_gam_mod
mod2$model_file <- readLines(cmd_mod$stan_file())
mod2$max_treedepth <- 12
class(mod2) <- 'mvgam'

# calculate Dunn-Smyth residual distributions
mod2$resids <- mvgam:::dsresids_vec(mod2)

# not essential, but in case you want the estimated p-values
# of any smooths for the summary()
ss_gam <- mod2$mgcv_model
V <- cov(mvgam:::mcmc_chains(out_gam_mod, 'b'))
ss_gam$Ve <- ss_gam$Vp <- ss_gam$Vc <- V
p <- mvgam:::mcmc_summary(out_gam_mod, 'b',
                          variational = algorithm %in% c('meanfield',
                                                         'fullrank',
                                                         'pathfinder',
                                                         'laplace'))[,c(4)]
names(p) <- names(ss_gam$coefficients)
ss_gam$coefficients <- p
ss_gam <- mvgam:::compute_edf(ss_gam, mod2, 'rho', 'sigma_raw')
mod2$mgcv_model <- ss_gam

# all functions should now work
summary(mod2)
mcmc_plot(mod2, variable = 'b', regex = TRUE)
conditional_effects(mod2)
plot(mod2)

@A108669
Copy link

A108669 commented Feb 29, 2024

@nicholasjclark - Thank you a ton for the example. Extremely helpful!

@nicholasjclark nicholasjclark added the enhancement New feature or request label May 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants