Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

virtual declarations missing for get_constrained_sizedtypes() and get_constrained_sizedtypes() in model_base #3175

Open
andrjohns opened this issue Mar 24, 2023 · 9 comments

Comments

@andrjohns
Copy link
Contributor

Summary:

stanc outputs methods for returning a json-formatted string of the parameter metadata, like the following for the bernoulli model:

  inline std::string get_constrained_sizedtypes() const {
    return std::string("[{\"name\":\"theta\",\"type\":{\"name\":\"real\"},\"block\":\"parameters\"}]");
  }
  inline std::string get_unconstrained_sizedtypes() const {
    return std::string("[{\"name\":\"theta\",\"type\":{\"name\":\"real\"},\"block\":\"parameters\"}]");
  }

But these methods don't have corresponding virtual methods in src/model/model_base.hpp

Current Version:

v2.31.0

@WardBrian
Copy link
Member

I’m not sure why these methods exist in the generated model to be honest, since nothing currently seems to use them.

Do you have a use in mind for them?

@andrjohns
Copy link
Contributor Author

I was after a way of getting the dimensions of the unconstrained parameters, and get_unconstrained_sizedtypes() looks like the only option

@andrjohns
Copy link
Contributor Author

Another option could be to parse the output of unconstrained_param_names(), but also open to other suggestions if there's something obvious I'm missing!

@bob-carpenter
Copy link
Contributor

I was after a way of getting the dimensions of the unconstrained parameters

What's the eventual goal?

I missed when those functions got added and would not have approved. I don't see the point of get_constrained_sizetypes() (or the prefix "get_"), as it just returns a string encoding name/size pairs. It'd be better to return a pair consisting of a std::vector<std::string> of strings and a std::vector<size_t> of sizes.

@WardBrian
Copy link
Member

They were added back in 2019 but never exposed or used anywhere: stan-dev/stanc3#226

Using std::vector<size_t> for objects is more or less singlehandedly holding up the tuples PR at this point, so avoiding using that is a win in my eyes.

The JSON these return is much richer anyway since it lets you distinguish matrix[3,2] and array[3,2] real, for example, and is probably the part of the compiler which was most ready for tuples/non-rectangular types, and would probably be useful for cmdstanpy/r when it comes time to process CSV outputs which contain tuples

@andrjohns
Copy link
Contributor Author

What's the eventual goal?

I'm updating the handling of unconstrained pars in cmdstanr to be structured using the draws formats from posterior (motivation in this issue).

For that I need to know their dimensions - given that they can differ from the constrained parameter dimensions returned by get_dims()

@bob-carpenter
Copy link
Contributor

I wrote back on the original issue asking why. I don't understand the motivation and have the same concerns as @mitzimorris expressed about the proposed solution.

I've always found the R devs to have almost diametrically opposed preferences to mine, which is for minimal but composable (i.e., unix-like) APIs. For example, I prefer the simple direct manipulations of NumPy to the do-it-all functions of the Tidyverse.

@andrjohns
Copy link
Contributor Author

I wrote back on the original issue asking why. I don't understand the motivation and have the same concerns as @mitzimorris expressed about the proposed solution.

Note that the proposed solution from the linked post (a named list) has since been eschewed in favour of the draws formats - which is what the results of sampling/optimisation/etc (i.e., the constrained parameters) currently use. In other words, it's aligning the APIs for the constrained and unconstrained parametetrs.

At a higher level, the motivation is that there are multiple use-cases which need to operate on the unconstrained parameters (e.g., moment-matching, power-scaling, plotting). Users/packages currently need to manually track the indexes of particular parameters in the single unlabeled numeric vector of values when they're performing operations or constructing their own unconstrained parameter inputs - not very user-friendly or robust to error.

Additionally, the current single numeric vector structure isn't terribly amenable to a workflow operating on multiple iterations across multiple chains.

Given that we have an existing API in the draws format for operating on parameters from multiple chains, as well as access to the model functions for unconstrained parameter dimensions, it seems like a good quality-of-life improvement for users

@andrjohns
Copy link
Contributor Author

andrjohns commented Mar 24, 2023

And also to clarify that the proposed solution in the linked post was using the existing structure for specifying initial values that is used by both rstan and cmdstanr

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants