
bounds #10

Open
kellijohnson-NOAA opened this issue Feb 9, 2015 · 9 comments

Comments

@kellijohnson-NOAA
Contributor

Hello all - I am looking for guidance on how we should standardize the parameter bounds for: a) parameters that are initialized at zero, b) negative parameters, c) all other parameters.

In the last study I used a lower bound of 0.5% and an upper bound of 500% or 1000% for most parameters, and a range of -20 to 20 for log catchability. Using the double normal for the selectivity curves instead of the logistic has added some new complications. Below are some of the parameters in the cod model (Low, High, INIT). Some of the selectivity parameters are not estimated, but if we were to attempt to estimate dome-shaped selectivity it would become an issue. Is there a way we can specify a distribution (truncated normal or normal) for each parameter and provide an SD, so that the bounds are obtained from the percentiles? That way it would not matter if a parameter was negative, had an INIT of zero, or was bounded at zero.

.001 1 .2 # NatM_p_1_Fem_GP_1
.1 200 20 # L_at_Amin_Fem_GP_1
.66 1320 132 # L_at_Amax_Fem_GP_1
.001 1 0.2 # VonBert_K_Fem_GP_1
5e-04 .5 .1 # CV_young_Fem_GP_1
5e-04 .5 .1 # CV_old_Fem_GP_1
.0935 93.5 18.6996 # SR_LN(R0)
0 13.5 2.7 # SizeSel_1P_1_Fishery Fishery PEAK value
-5 3 -1.0 # SizeSel_1P_2_Fishery Fishery TOP logistic
-4 12 0 # SizeSel_1P_3_Fishery Fishery WIDTH exp
-2 75 15.0 # SizeSel_1P_4_Fishery Fishery WIDTH exp
-15 5 -999 # SizeSel_1P_5_Fishery Fishery INIT logistic
-5 5 -999 # SizeSel_1P_6_Fishery Fishery FINAL logistic
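
A minimal sketch of the percentile idea, in R and using base R only; `bounds_from_normal` is a hypothetical helper, not an existing package function. Given an INIT value and a user-supplied SD, it takes symmetric normal quantiles as the lower and upper bounds, which works whether INIT is negative, zero, or positive (a truncated normal would only be needed where the parameter has a hard limit such as zero).

```r
# Hypothetical helper: derive bounds as normal quantiles centred at INIT.
bounds_from_normal <- function(init, sd, probs = c(0.005, 0.995)) {
  stats::qnorm(probs, mean = init, sd = sd)
}

# Example: a log-scale parameter initialized at 0 with SD = 2
bounds_from_normal(init = 0, sd = 2)
#> roughly -5.15 and 5.15
```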
@cstawitz
Contributor

cstawitz commented Feb 9, 2015

Kelli - I can modify the function to special-case parameters containing "SizeSel" as well as lnQ, so that instead of percentages the bounds are set to whatever the user inputs.

Using a truncated normal or double normal sounds more elegant, but then we put the burden on the user to do the extra work of calculating a reasonable SD for each parameter for each stock. A CV would be easier, since it would presumably be more similar across parameters and life histories, but we would still need to special-case the negative parameters, since CVs don't really make sense for negative values.

In the meantime I like the simpler option of just special-casing the selectivity parameters.
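
A rough sketch of the special-casing Christine describes, assuming the parameters live in a data frame with columns Label, LO, HI, and INIT (mirroring the SS control-file columns); the function name, column names, and the "SizeSel|LnQ" pattern are illustrative assumptions, not the package's actual interface.

```r
set_bounds <- function(pars, lower_pct = 0.005, upper_pct = 5) {
  # Selectivity and log-catchability parameters keep their user-supplied bounds
  special <- grepl("SizeSel|LnQ", pars$Label)
  # All other parameters get percentage-based bounds around INIT
  # (0.5% and 500% by default, as in the earlier study)
  pars$LO[!special] <- pars$INIT[!special] * lower_pct
  pars$HI[!special] <- pars$INIT[!special] * upper_pct
  pars
}
```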

@cstawitz
Contributor

cstawitz commented Feb 9, 2015

Credit to Juan for a lot of that input - thanks for stopping by to chat about it, Juan!


@colemonnahan
Contributor

From what I understand, the bounds serve two purposes: (1) restrict parameters to stabilize the optimization, and (2) restrict parameters to biologically/physically meaningful values. The former aids convergence and the latter affects the interpretability of the model. I think some people will consider ours far too wide -- i.e., they allow the model to estimate parameters outside of what a scientist would consider plausible. Sometimes we may want to see what the model does; sometimes we want to constrain a parameter to be reasonable to better understand the rest of the model. It really depends on the study.

I think it's valuable to discuss this for our studies, but to play devil's advocate: I would caution us not to spend too much time on generic/flexible/bulletproof bound setting. In practice the user will run some models, notice that something is up against a bound, and expand it if desired (this should be part of the testing phase of a simulation). If they adapt their own model they'll have a good sense of what bounds to use; if they're using our built-in ones we will have some set for them.
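
A minimal sketch of that kind of post-hoc check, assuming the estimates and bounds are available in a data frame with columns Value, LO, and HI (a hypothetical layout):

```r
# Flag estimates that sit within `tol` (a fraction of the bound width)
# of either bound, so the user can widen those bounds before rerunning.
near_bounds <- function(est, tol = 0.01) {
  width <- est$HI - est$LO
  hit <- (est$Value - est$LO) < tol * width | (est$HI - est$Value) < tol * width
  est[hit, , drop = FALSE]
}
```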

I see the function as a useful tool, but not one that should relieve the user of thinking about these issues. I just don't see a way to write a function that is going to work in all cases, meaning the user will need to take control.

My two cents!

@iantaylor-NOAA
Member

Cole, your description of bounds matches my thinking.

I think that in a simulation study you have a choice between wide bounds that are rarely hit, which allow models to wander far from the true values and therefore let model misspecification have a more noticeable effect on parameter estimates and quantities of interest, and narrower bounds, in which case misspecification won't push things as far off but will probably increase the fraction of models that hit the bounds (which you can then report in the results). I think the first approach, which is what you've chosen for these projects, is more satisfying.

I can discuss choices of bounds for double-normal selectivity some less busy day, but it sounds like Juan stepped up with some advice.

Tedious though these boundless discussions of bounds may seem, I think this is one of many under-appreciated subtleties that stock assessment folks don't think about enough and don't have best practices for, so the discussions are definitely useful.
-Ian

@kellijohnson-NOAA
Contributor Author

Great discussion. Thanks, everyone, for participating. I think for right now we should focus on how we want to specify the bounds for the parameters of our models; we can worry about the code for other models later. For instance, my use of -20 and 20 for ln(R0) last year was an arbitrary decision. Is this okay, and should we use it again? Maybe we can designate someone who knows more about stock assessment to make some rules for us to use so we can get the models going? Any thoughts?

@colemonnahan
Contributor

I say we consult @taylori for the Hake model and @juanlvalero for the other two. We can easily check bound issues during our analyses and then come back to them if we need to. I vote that our priority should be getting the three new models working so we can move forward and be ready to run scenarios next week.

@iantaylor-NOAA
Member

Happy to help. Should have more time starting today.

ln(R0) must be positive, and after exponentiation R0 is in thousands of fish. I would suggest 4 and 20 as very wide bounds, corresponding to roughly 50 thousand to 500 billion recruits. But R0 is often well informed anyway, so its bounds matter less than those for selectivity.
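
For reference, the arithmetic behind those bounds (with R0 in thousands of fish):

```r
exp(4)  * 1000  # ~ 5.46e4  -> about 50 thousand recruits
exp(20) * 1000  # ~ 4.85e11 -> about 500 billion recruits
```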

@kellijohnson-NOAA
Contributor Author

With the new bounds I am getting warnings saying:

min bound on parameter for size at peak is 5.08; should be >= midsize bin 2 (11.5)
min bound on parameter for size at peak is 5.08; which is < min databin (20), so illogical

Maybe we should use these two conditions as the minimum bound for the first selectivity parameter.
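
A minimal sketch of that rule, assuming the population mid-length bins and the data length bins are available as numeric vectors; the function name and the example bin values are made up.

```r
# Lower bound for the size-at-peak parameter, following the two conditions in
# the SS warnings: at least the second population mid-length bin and at least
# the smallest data length bin.
min_peak_bound <- function(mid_bins, data_bins) {
  max(mid_bins[2], min(data_bins))
}

min_peak_bound(mid_bins = c(9.5, 11.5, 13.5), data_bins = seq(20, 150, by = 2))
#> [1] 20
```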

@iantaylor-NOAA
Member

This gets to my earlier suggestion that some growth parameters could have bounds set based on the range of length bins. The concern when I brought that up was that the range of lengths differs between species, and potentially even between cases, so it would be unnecessarily complex. But if you used the range of population length bins as inputs to the function that changes the bounds, it would be straightforward to bound the peak selectivity, Lmin, and Lmax parameters within that range. Theoretically this could aid convergence.

Or you could just ignore the warnings.
-Ian
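
A sketch of the bin-based bounding Ian describes, assuming the same hypothetical bounds data frame as above and a vector of population length bins; the label pattern is an assumption for illustration.

```r
# Clamp bounds for size-based parameters (selectivity peak, L_at_Amin,
# L_at_Amax) to the range of the population length bins.
clamp_to_bins <- function(pars, len_bins, pattern = "PEAK|L_at_Amin|L_at_Amax") {
  idx <- grepl(pattern, pars$Label)
  pars$LO[idx] <- pmax(pars$LO[idx], min(len_bins))
  pars$HI[idx] <- pmin(pars$HI[idx], max(len_bins))
  pars
}
```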
