Question about model error structure #354

tsebens · 2022-12-13T21:26:17Z

Less of an issue and more of a clarification: when I use e to specify subsets of the data that should be fitted with separate error distributions (which are then specified in ObsModel), how does VAST handle this? Does it split the dataset and fit multiple instances of the model, or does it fit a single model whose likelihood function takes a different form depending on the data subset?

agruss2 · 2022-12-14T08:20:36Z

Hello,

When you fit a VAST model to multiple data types and, therefore, need to specify multiple distribution models in VAST, you need to:
(1) Include a “Data_type” column in your dataset.
(2) Modify the “ObsModel” object in VAST so that it includes several rows instead of just one.
(3) Include a data type catchability factor in the first linear predictor of your VAST model (preferably specified as a fixed effect):

catchability_data = my_dataset[,'Data_type',drop = FALSE]
Q1_formula = ~ factor( Data_type )
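
To make these two lines concrete, here is a minimal base-R sketch of what they produce on a toy dataset. The column name "Catch" and all values are hypothetical and only serve as illustration; model.matrix() is used here just to show which catchability contrasts the formula encodes, not as a description of VAST's internals.

```r
# Toy dataset: the "Catch" column and all values are hypothetical.
my_dataset <- data.frame(
  Catch = c(0, 1, 3, 12.5, 0, 7.2),
  Data_type = factor(
    c("Encounter", "Encounter", "Count", "Biomass", "Biomass", "Biomass"),
    levels = c("Encounter", "Count", "Biomass")  # order matters, see the cases below
  )
)

# drop = FALSE keeps a one-column data frame instead of a bare vector.
catchability_data <- my_dataset[, "Data_type", drop = FALSE]
Q1_formula <- ~ factor(Data_type)

# Illustration only: the design matrix implied by the formula.
# The first factor level acts as the reference level.
design <- model.matrix(Q1_formula, data = catchability_data)
print(colnames(design))
```

With three data types, the formula yields an intercept plus one contrast column for each non-reference data type.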

Let us see how things work in practice:
(1) If you work only with biomass-sampling data (or any data type that can take any non-negative real number), then you will not need any “Data_type” column in your dataset; and you will set ObsModel to c( 2, 1 )
(2) If you work with biomass-sampling data and count data (or any data type that can take any non-negative integer), then you will need to: include a “Data_type” column in your dataset, with levels “Count” and “Biomass”, in this order; and set ObsModel to cbind( c( 14, 2 ), 1 )
(3) If you work with biomass-sampling data and encounter/non-encounter data, then you will need to: include a “Data_type” column in your dataset, with levels “Encounter” and “Biomass”, in this order; and set ObsModel to cbind( c( 13, 2 ), 1 )
(4) If you work with count data and encounter/non-encounter data, then you will need to: include a “Data_type” column in your dataset, with levels “Encounter” and “Count”, in this order; and set ObsModel to cbind( c( 13, 14 ), 1 )
(5) If you work with biomass-sampling data, count data and encounter/non-encounter data, then you will need to: include a “Data_type” column in your dataset, with levels “Encounter”, “Count” and “Biomass”, in this order; and set ObsModel to cbind( c( 13, 14, 2 ), 1 )
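
As a rough sketch of case (5), the following base-R snippet builds the ObsModel matrix and shows how the order of the “Data_type” levels lines up with its rows. The toy observations are hypothetical, and the zero-based index stored in e_i below is an assumption about how observations are mapped to rows of ObsModel; please check it against the VAST documentation before relying on it.

```r
# ObsModel for case (5): one row per data type, in the order
# Encounter, Count, Biomass. The second column is 1 in every row.
ObsModel <- cbind(c(13, 14, 2), 1)

# Hypothetical observations; the level order must match the rows of ObsModel.
Data_type <- factor(
  c("Biomass", "Encounter", "Count", "Biomass"),
  levels = c("Encounter", "Count", "Biomass")
)

# 1-based row of ObsModel used by each observation.
row_index <- as.integer(Data_type)

# ASSUMPTION: a zero-based index per observation (not confirmed in this thread).
e_i <- row_index - 1L

# Distribution codes applied to each observation.
print(ObsModel[row_index, , drop = FALSE])
```

If the levels were left in alphabetical order (“Biomass”, “Count”, “Encounter”), the rows of ObsModel would no longer match the data types, which is why the order is stated explicitly in each case above.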

When your VAST model is fitted to multiple data types, the likelihoods for the different data types (e.g., encounters/non-encounters, counts and biomass-sampling data) share parameters, because you are using a Poisson-link delta model (you set ObsModel[,2] to 1). Consequently, a single VAST model is fitted to all the data, and its likelihood is the product of the likelihoods for the individual data types.
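
In symbols, this can be sketched as follows. The notation is mine and not taken from the VAST documentation: b_i is observation i, e_i is the data type of that observation, f_k is the distribution chosen for data type k in ObsModel, and θ stands for all shared parameters and random effects. The expression covers only the data part of the likelihood, conditional on θ.

```latex
\mathcal{L}(\theta)
  = \prod_{i=1}^{N} f_{e_i}\!\left(b_i \mid \theta\right)
  = \prod_{k} \; \prod_{i \,:\, e_i = k} f_k\!\left(b_i \mid \theta\right),
\qquad
\log \mathcal{L}(\theta)
  = \sum_{k} \; \sum_{i \,:\, e_i = k} \log f_k\!\left(b_i \mid \theta\right)
```

So the dataset is not split into separate fits: every observation contributes one factor to the same likelihood, and only the form of that factor depends on the data type.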
