Incorporating known locations #46

camrinbraun · 2021-03-06T17:20:17Z

The current logic in HMMoce incorporates known locations by "fixing" the grid cell containing the known location to 1 and setting all other cells for a given time step to 0. This effectively incorporates the known location in the likelihood calculation. However, when the filter and smoother processes are performed, the resulting posteriors will almost certainly contain the known grid cell but may not set that cell as known in the final track.

Use make.L() to build likelihoods. Known locations are incorporated as fixed grid cells with likelihood = 1.
When hmm.filter() operates on the overall likelihood L, the known grid cell for a given time step informs the posterior distributions but does not necessarily fix them to match the known location (same for hmm.smoother()).
calc.track() currently uses the max of each step in the smoother output as the assigned track position for that step.

Thus, the issue is that calc.track() does not guarantee a given step of an output track matches the known location at that step.

Potential (pseudo-code) solution:

## L is built as normal, across the whole dataset
L <- make.L(L.res$L.rasters, iniloc = iniloc, dateVec = dateVec)

## data frame of known locations (one per row; needs the same columns as iniloc, including Date)
known.locs <- data.frame()
## combine with start and end locations
known.locs <- rbind(iniloc[1,], known.locs, iniloc[2,])
known.idx <- findInterval(known.locs$Date, dateVec)

## split the HMM process into discrete trajectories that are anchored at start/end by known locations
## iterate through each trajectory (assumed here: one per consecutive pair of known locations)
s_all <- list()
for (i in seq_len(length(known.idx) - 1)){
  ## time steps for this trajectory, including both anchoring known locations
  traj_idx <- known.idx[i]:known.idx[i + 1]
  f <- hmm.filter(g = L.res$g, L = L[,,traj_idx], K = K, P = P, m = 2)
  s <- hmm.smoother(f, K = K, L = L[,,traj_idx], P = P)
  s_all[[i]] <- s
}

## append s_all back into a single smoother output by concatenating across trajectories
## (still pseudo-code: consecutive trajectories share an anchor step, so drop the duplicate)
s <- s_all[some_combine_idx]

## should be able to combine across smoother outputs, s, and calc.track at the end
tr <- calc.track(s, g = L.res$g, dateVec, iniloc, method='mean')
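
The split-and-recombine bookkeeping above can be sketched with base R on a toy array. This is only an illustration under stated assumptions: the helper names split_segments() and bind_segments() are hypothetical (not part of HMMoce), and the toy array puts time on the third dimension, which may not match the layout of the actual hmm.smoother() output.

```r
## Hypothetical helpers (not part of HMMoce); base R only.

## Turn the time indices of known locations into a list of segment index vectors.
## Consecutive segments share their anchoring time step.
split_segments <- function(known.idx) {
  known.idx <- sort(unique(known.idx))
  lapply(seq_len(length(known.idx) - 1),
         function(i) known.idx[i]:known.idx[i + 1])
}

## Concatenate per-segment arrays along the time dimension (ASSUMED to be the
## third dimension), dropping the duplicated first step of every segment
## after the first.
bind_segments <- function(seg_list) {
  trimmed <- lapply(seq_along(seg_list), function(i) {
    a <- seg_list[[i]]
    if (i > 1) a <- a[, , -1, drop = FALSE]
    a
  })
  n_time <- sum(vapply(trimmed, function(a) dim(a)[3], integer(1)))
  array(unlist(trimmed), dim = c(dim(trimmed[[1]])[1:2], n_time))
}

## Toy example: a 4 x 5 grid over 10 time steps, known locations at steps 1, 4 and 10
toy <- array(runif(4 * 5 * 10), dim = c(4, 5, 10))
segs <- split_segments(c(1, 4, 10))
pieces <- lapply(segs, function(idx) toy[, , idx, drop = FALSE])
rebuilt <- bind_segments(pieces)
```

Here the pieces are plain slices of the toy array, so rebuilt should simply reproduce toy. In a real run each piece would be a separately smoothed segment, and the shared anchor step would need to agree between neighbouring segments before one copy is dropped.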

Potential issues:

How do we deal with parameter optimization? Do the parameters have to be fixed a priori?
@galuardi can this be dealt with more explicitly if we implemented the Viterbi algorithm for calc.track()?

camrinbraun · 2021-03-06T17:20:50Z

What do you think, @galuardi @marosteg?

galuardi · 2021-03-06T17:40:41Z

You could substitute the likelihood for known-location days in the final posterior pdf. Probably a very easy fix.

galuardi · 2021-03-06T17:41:46Z

Although the track split would be better for uncertainty.

marosteg · 2021-03-07T15:21:09Z

If following the first approach ('substitute the likelihood for known location days in the final posterior pdf'), would the best way to do this be to incorporate known location days in make.L(..., known.locs = ***) as well as via substitution in the output of hmm.smoother() prior to use of calc.track()? That way, the model is guided by those known location days during its construction and then the resulting guided track is later forced through those substituted points.
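
A minimal sketch of what that substitution step could look like, using base R on a toy posterior. The helper fix_known() is hypothetical (not part of HMMoce), and the array layout (rows, cols, time) is an assumption that may differ from the real hmm.smoother() output.

```r
## Hypothetical helper (not part of HMMoce); base R only.
## post:  array of posterior probabilities, ASSUMED layout (rows, cols, time)
## known: data frame with columns row, col, t (one row per known location)
fix_known <- function(post, known) {
  for (k in seq_len(nrow(known))) {
    post[, , known$t[k]] <- 0                           # zero the whole surface at that step
    post[known$row[k], known$col[k], known$t[k]] <- 1   # put all mass on the known cell
  }
  post
}

## Toy posterior: 3 x 3 grid, 4 time steps, each step normalised to sum to 1
post <- array(runif(3 * 3 * 4), dim = c(3, 3, 4))
post <- sweep(post, 3, apply(post, 3, sum), "/")
known <- data.frame(row = 2, col = 3, t = 2)
fixed <- fix_known(post, known)

## The most probable cell at step 2 is now the known cell
which(fixed[, , 2] == max(fixed[, , 2]), arr.ind = TRUE)
```

Only the known steps change in this sketch; the surfaces at neighbouring steps are left exactly as the smoother produced them.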

For the second approach, could you elaborate on the benefit to uncertainty of modeling the track in segments and then conjoining them? I wonder about parameter optimization for this approach, given that the track as a whole may contain both behavioral states but a given segment may only contain one, meaning that some segments would ideally be run as a two-state and others as one-state (and require knowing which to use).

camrinbraun · 2021-03-08T15:09:36Z

> If following the first approach ('substitute the likelihood for known location days in the final posterior pdf'), would the best way to do this be to incorporate known location days in make.L(..., known.locs = ***) as well as via substitution in the output of hmm.smoother() prior to use of calc.track()? That way, the model is guided by those known location days during its construction and then the resulting guided track is later forced through those substituted points.

Yeah, this is worth a try and will probably work fine, although it may not be strictly kosher from a modeling perspective.

> For the second approach, could you elaborate on the benefit to uncertainty of modeling the track in segments and then conjoining them? I wonder about parameter optimization for this approach, given that the track as a whole may contain both behavioral states but a given segment may only contain one, meaning that some segments would ideally be run as a two-state and others as one-state (and require knowing which to use).

In this case, you're explicitly including the known locations throughout the modeling and track construction process (by splitting into segments/trajectories). So there would be no ad hoc "substitution" of knowns in the final posterior.

Re: parameters -> what if you performed a full model run using the whole dataset, adding known locations in make.L(), for parameter estimation? This is a totally reasonable approach for calculating the posteriors and should work just fine for getting your parameters. Then you could apply those parameters to the segmented model run that would result in your final track.

galuardi · 2021-03-08T15:27:37Z

The more I thought about it, the less I liked the ad hoc substitution (think before you type, ben!). It will certainly work in giving you the known locations at the correct times, but it would probably create nonsense in between them. I could foresee jumps in space if a trajectory veers away from the known point a bit, only to jump back to the correct location. The track splitting would be the best for dealing with all aspects of the outputs.

Other methods have dealt with this successfully: BSAM for Argos locations (not a grid-based method), and this paper:

Strøm, J.F., Thorstad, E.B., Chafe, G., Sørbye, S.H., Righton, D., Rikardsen, A.H., and Carr, J. 2017. Ocean migration of pop-up satellite archival tagged Atlantic salmon from the Miramichi River in Canada. ICES J Mar Sci 74(5): 1356–1370. doi:10.1093/icesjms/fsw220.

which was an HMM version that used known positions. No code available, though.

camrinbraun · 2021-03-08T16:09:51Z

> Other methods have dealt with this successfully: BSAM for Argos locations (not a grid-based method), and this paper:
>
> Strøm, J.F., Thorstad, E.B., Chafe, G., Sørbye, S.H., Righton, D., Rikardsen, A.H., and Carr, J. 2017. Ocean migration of pop-up satellite archival tagged Atlantic salmon from the Miramichi River in Canada. ICES J Mar Sci 74(5): 1356–1370. doi:10.1093/icesjms/fsw220.
>
> which was an HMM version that used known positions. No code available, though.

Agreed. The Strøm paper is an excellent suggestion. @marosteg I'd say give this trajectory idea a shot and we'll go from there. I know J Carr well enough to ask him about their approach in this Atl salmon paper. Would be good to have some prelim results from our trajectory attempts before I ask him, though. And if you want to test this on a different track, we do have known locations for a lot of blue sharks, including the individual used as the example data in HMMoce (141259 I think).

marosteg · 2021-03-08T16:45:41Z

I just looked through the Strøm paper, but it is unclear how the acoustic detections were incorporated as known locations. It doesn't specify whether they were supplied as input likelihoods for model construction, ad hoc substitutions in the posterior, or start/end points for modeled segments that were then conjoined. The only comment to this effect is, "[Acoustic tracking at entry and exit of Gulf of St. Lawrence] was done to increase the number of known positions independent of the PSAT data, and decrease the uncertainty of the geolocation model".

galuardi · 2021-03-08T17:10:17Z

Without code, it's hard to say exactly how they did it. I had always hoped they would put out their own R package. I met Strøm and he indicated they would, but that was in 2015.

My guess is they did it via likelihood. They have a pretty small grid (10km), and the bathymetry in that area provides nice constraints. Other than that, it's essentially an R rewrite of GPE3 (which was a rewrite of Pedersen 2011).

Other notes: they did use MTI raw geolocations for the lat/long likelihood. Most of the lats probably got removed, judging by their methods and my experience with them.

camrinbraun added the question label Mar 6, 2021

marosteg mentioned this issue Nov 1, 2021

Separate Lat and Lon Likelihood Surfaces #63

Closed
