Multiple matches with family argument in ggs() #73

Open
mdodrill-usgs opened this issue Apr 21, 2021 · 2 comments

Comments

@mdodrill-usgs

Hi,

When using the family argument to pull only a set of named parameters from a fitted model object, the function ggs() returns multiple matches when the parameter names are similar (see example below). This isn't the behavior I expected when looking at the documentation for ggs() (i.e., A family of parameters is considered to be any group of parameters with the same name but different numerical value between square brackets (as beta[1], beta[2], etc).)

library(rstan)
library(ggmcmc)

# toy model
ex_model_code <- '
  parameters {
    real alpha[2,3];
    real alpha_2[2]; 
  } 
  model {
    for (i in 1:2) for (j in 1:3) 
      alpha[i, j] ~ normal(0, 1); 
    for (i in 1:2) 
      alpha_2[i] ~ normal(0, 2); 
  } '
fit <- stan(model_code = ex_model_code, chains = 4) 

f1 <- ggs(fit, family = "alpha")

# both alpha & alpha_2 are returned
unique(f1$Parameter)

Maybe this is the desired behavior, but a nice feature would be to return only the parameters whose name before the square brackets exactly matches the string provided as the family argument, rather than every parameter that contains it ("alpha" and "alpha_2" in the example).
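A minimal sketch of why both families come back, assuming family is applied as an unanchored regular expression to the parameter names. The vector of names below is typed by hand to mimic the toy model; it is not taken from a real fit, and no ggmcmc function is called:

```r
# Hand-written stand-in for the parameter names of the toy model
# (an assumption for illustration, not output from stan()).
parameter_names <- c(
  "alpha[1,1]", "alpha[1,2]", "alpha[1,3]",
  "alpha[2,1]", "alpha[2,2]", "alpha[2,3]",
  "alpha_2[1]", "alpha_2[2]", "lp__"
)

# An unanchored pattern matches anywhere in the name,
# so "alpha" also matches every "alpha_2[...]" entry.
loose <- grep("alpha", parameter_names, value = TRUE)

# Strip the bracketed index to see which families were picked up.
loose_families <- unique(sub("\\[.*$", "", loose))
print(loose_families)
```

Both "alpha" and "alpha_2" appear in loose_families, which mirrors what unique(f1$Parameter) shows above.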

Thanks

@xfim
Owner

xfim commented Apr 22, 2021

Hi @mdodrill-usgs. Thank you very much for using ggmcmc and for reporting issues.

In this case the behaviour is the one that I had in mind. The documentation may be misleading, and I would certainly change it depending on how this conversation goes.

The idea with family is to empower the user through the use of regular expressions. So basically anything in the family is a plain regular expression in R. This means that it is very easy to do things like:

  • Only strict alpha (as a single parameter, without brackets): family = "^alpha$"
  • Only strict alpha and the brackets: family = "^alpha\\["
  • alpha or sigma.alpha, but not other alphas: family = "^alpha\\[|^sigma\\.alpha\\["
  • Only the first element of the first dimension in alpha: family = "^alpha\\[1,"
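The patterns in the list above can be tried without fitting a model. This is a sketch using base R's grep() on a hand-written vector of names; the names are hypothetical and chosen only so that each pattern has something to match or reject. The dot in sigma.alpha is escaped here so that it matches a literal period rather than any character:

```r
# Hypothetical parameter names, chosen to exercise each pattern.
parameter_names <- c(
  "alpha", "alpha[1,1]", "alpha[2,1]",
  "alpha_2[1]", "sigma.alpha[1]", "sigma.beta[1]"
)

# The four patterns from the list above, as they would be passed to family.
patterns <- c(
  strict_scalar        = "^alpha$",
  strict_family        = "^alpha\\[",
  alpha_or_sigma_alpha = "^alpha\\[|^sigma\\.alpha\\[",
  first_row            = "^alpha\\[1,"
)

# For each pattern, keep the names it matches.
matches <- lapply(patterns, grep, x = parameter_names, value = TRUE)
print(matches)
```

None of the four patterns selects alpha_2[1], because each one anchors the name at the start with ^ and requires either the end of the string or an opening bracket right after alpha.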

So unless you have a strong preference for changing its behaviour to return only the parameters whose name before the brackets matches the family, I would prefer to keep the current situation. Another option, if you think it really adds to the current situation, would be to add another argument specifying whether the family should work as a regular expression (current situation, and default) or as a strict match on the name before the brackets.
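To make the second option concrete, here is a rough sketch of what such a switch could look like. The function match_family() and its strict argument are hypothetical names invented for this illustration; they are not part of ggmcmc, and this is not how ggs() is implemented:

```r
# Hypothetical helper (not a ggmcmc function): select parameter names
# either by regular expression (default) or by exact family name.
match_family <- function(parameter_names, family, strict = FALSE) {
  if (strict) {
    # Drop the bracketed index, then compare the remaining name exactly.
    base_names <- sub("\\[.*$", "", parameter_names)
    parameter_names[base_names == family]
  } else {
    # Regular-expression matching, as in the current behaviour described above.
    grep(family, parameter_names, value = TRUE)
  }
}

# Hand-written names mimicking the toy model (an assumption, not a real fit).
parameter_names <- c("alpha[1,1]", "alpha[2,1]", "alpha_2[1]", "alpha_2[2]")

strict_result <- match_family(parameter_names, "alpha", strict = TRUE)
regex_result  <- match_family(parameter_names, "alpha")
print(strict_result)
print(regex_result)
```

In strict mode the family string is compared literally, so a user would not need to escape brackets or dots; the trade-off is that none of the regular-expression selections listed above would be available in that mode.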

What do you think?

@mdodrill-usgs
Author

Hi @xfim,

Thank you for the prompt reply and your work on ggmcmc.

I think some small changes to the documentation would be enough to clarify the behavior when family is supplied as a character vector. Maybe something like, "When family is given as a character vector, any parameters containing the string supplied are returned (e.g., with family = "beta", both beta[1] and beta.alpha[1] are returned)." could be added (if that is the correct logic of how the matching works with a character vector).

Also, for those not as familiar with building regular expressions (like me), maybe a condensed version of your response above could be added to the examples in the documentation of ggs(). This would guide users toward some handy expressions (and show the difference between supplying family as a plain string and as an anchored regular expression).

Building off the little Stan model above, maybe something like this:

#' @examples 
#' \dontrun{
#' library(rstan)
#' 
#' # toy model
#' ex_model_code <- '
#'   parameters {
#'     real alpha[2,3];
#'     real alpha_sigma[2];
#'   }
#'   model {
#'     for (i in 1:2) for (j in 1:3)
#'       alpha[i, j] ~ normal(0, 1);
#'     for (i in 1:2)
#'       alpha_sigma[i] ~ normal(0, 2);
#'   } '
#' fit <- stan(model_code = ex_model_code, chains = 4)
#' 
#' # family as character vector, both alpha and alpha_sigma:
#' f1 <- ggs(fit, family = "alpha")
#' 
#' # only strict alpha and the brackets:
#' f1 <- ggs(fit, family = "^alpha\\[")
#' 
#' # Only the first element of the first dimension in alpha: 
#' f1 <- ggs(fit, family = "^alpha\\[1,")
#' 
#' # etc.
#' 
#' }
