Incorporating known locations #46

Open
camrinbraun opened this issue Mar 6, 2021 · 9 comments
@camrinbraun
Owner

camrinbraun commented Mar 6, 2021

The current logic in HMMoce incorporates known locations by "fixing" the grid cell containing the known location to 1 and setting all other cells for that time step to 0. This effectively incorporates the known location in the likelihood calculation. However, after the filter and smoother steps are run, the resulting posteriors will almost certainly include the known grid cell, but the final track is not guaranteed to pass through it.

  • use make.L() to build likelihoods. Known locations are incorporated as fixed grid cells with likelihood = 1
  • when hmm.filter() operates on the overall L likelihood, the known grid cell for a given time step will inform the posterior distributions but not necessarily fix them to match the known location (same for hmm.smoother())
  • calc.track() currently uses the max of each step in the smoother output as the assigned track position for that step

Thus, the issue is that calc.track() does not guarantee a given step of an output track matches the known location at that step.

Potential (pseudo-code) solution:

## L is built as normal, across the whole dataset
L <- make.L(L.res$L.rasters, iniloc = iniloc, dateVec = dateVec)

## data frame of known locations (by row)
known.locs <- data.frame()
## combine with start and end locations
known.locs <- rbind(iniloc[1,], known.locs, iniloc[2,])
known.idx <- findInterval(known.locs$Date, dateVec)

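## split the HMM process into discrete trajectories that are anchored at start/end by known locations
## iterate through each trajectory (segment i runs from known.idx[i] to known.idx[i + 1])
s_all <- list()
for (i in seq_len(length(known.idx) - 1)){
  traj_idx <- known.idx[i]:known.idx[i + 1]
  ## NOTE: subset whichever dimension of L holds time; it is written as the third here
  f <- hmm.filter(g = L.res$g, L = L[,,traj_idx], K = K, P = P, m = 2)
  s <- hmm.smoother(f, K = K, L = L[,,traj_idx], P = P)
  s_all[[i]] <- s
}

## append s_all back into a single smoother output by concatenating across trajectories
## (still pseudo-code: adjacent segments share an anchor step, so drop the duplicate when combining)
s <- s_all[some_combine_idx]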

## should be able to combine across smoother outputs, s, and calc.track at the end
tr <- calc.track(s, g = L.res$g, dateVec, iniloc, method='mean')
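
A toy illustration of the split-and-recombine idea above, using only base R and a made-up array in place of real hmm.smoother() output. The helper names split_segments() and combine_segments(), the time-first dimension order, and all data are assumptions for illustration; none of them are HMMoce functions or confirmed HMMoce conventions.

```r
## Toy sketch: split a time axis at known-location steps, then stitch
## per-segment outputs back together. Everything here is hypothetical.
dateVec <- seq(as.Date("2021-01-01"), as.Date("2021-01-10"), by = "day")
known.dates <- as.Date(c("2021-01-01", "2021-01-04", "2021-01-10"))
known.idx <- findInterval(known.dates, dateVec)

## each segment runs from one known step to the next (anchor steps are shared)
split_segments <- function(known.idx) {
  lapply(seq_len(length(known.idx) - 1),
         function(i) known.idx[i]:known.idx[i + 1])
}

## concatenate per-segment arrays along time (ASSUMED to be dimension 1),
## dropping the first step of every segment after the first because it
## duplicates the last step of the previous segment
combine_segments <- function(seg_list) {
  trimmed <- lapply(seq_along(seg_list), function(i) {
    x <- seg_list[[i]]
    if (i > 1) x <- x[-1, , , drop = FALSE]
    x
  })
  nt <- sum(vapply(trimmed, function(x) dim(x)[1], integer(1)))
  out <- array(NA_real_, dim = c(nt, dim(trimmed[[1]])[2:3]))
  pos <- 0
  for (x in trimmed) {
    n <- dim(x)[1]
    out[pos + seq_len(n), , ] <- x
    pos <- pos + n
  }
  out
}

segs <- split_segments(known.idx)

## stand-in for running the filter/smoother on each segment:
## one random [time, lon, lat] array per segment
set.seed(42)
fake_s <- lapply(segs, function(idx) {
  array(runif(length(idx) * 3 * 2), dim = c(length(idx), 3, 2))
})

s_toy <- combine_segments(fake_s)
dim(s_toy)  # one slice per entry of dateVec
```

Which copy of the shared anchor step to keep is a design choice; this sketch keeps the one from the earlier segment.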

Potential issues:

  • How should parameter optimization be handled? Do the parameters have to be fixed a priori?
  • @galuardi can this be dealt with more explicitly if we implemented the Viterbi algorithm for calc.track()?
@camrinbraun
Owner Author

what do you think @galuardi @marosteg

@galuardi
Collaborator

galuardi commented Mar 6, 2021

You could substitute the likelihood for known location days in the final posterior pdf. Probably a very easy fix.

@galuardi
Collaborator

galuardi commented Mar 6, 2021

Although the track split would be better for uncertainty.

@marosteg
Collaborator

marosteg commented Mar 7, 2021

If following the first approach ('substitute the likelihood for known location days in the final posterior pdf'), would the best way to do this be to incorporate known location days in make.L(..., known.locs = ***) as well as via substitution in the output of hmm.smoother() prior to use of calc.track()? That way, the model is guided by those known location days during its construction and then the resulting guided track is later forced through those substituted points.
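
A minimal sketch of what the substitution step could look like, on a made-up posterior array in base R. The [time, lon, lat] layout, the substitute_known() helper, and the known data frame of grid indices are all assumptions for this example and may not match the structure of real hmm.smoother() output.

```r
## Toy sketch: overwrite the posterior at known-location steps with a
## point mass on the known grid cell. Array layout and names are hypothetical.
set.seed(1)
n_t <- 5; n_lon <- 4; n_lat <- 3
post <- array(runif(n_t * n_lon * n_lat), dim = c(n_t, n_lon, n_lat))

## normalize so the posterior at each time step sums to 1
post <- sweep(post, 1, apply(post, 1, sum), "/")

## hypothetical known locations: time step plus grid cell indices
known <- data.frame(step = c(1, 3, 5), lon_i = c(1, 2, 4), lat_i = c(1, 3, 2))

substitute_known <- function(post, known) {
  for (k in seq_len(nrow(known))) {
    t <- known$step[k]
    post[t, , ] <- 0                                   # clear the whole step
    post[t, known$lon_i[k], known$lat_i[k]] <- 1       # point mass on known cell
  }
  post
}

post_fixed <- substitute_known(post, known)

## the most probable cell at a known step is now the known cell
which(post_fixed[3, , ] == max(post_fixed[3, , ]), arr.ind = TRUE)
```

Steps without a known location are left untouched, so nothing in this sketch forces the neighbouring steps to be spatially consistent with the substituted cell.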

For the second approach, could you elaborate on the benefit to uncertainty of modeling the track in segments and then conjoining them? I wonder about parameter optimization for this approach, given that the track as a whole may contain both behavioral states but a given segment may only contain one, meaning that some segments would ideally be run as a two-state and others as one-state (and require knowing which to use).

@camrinbraun
Owner Author

> If following the first approach ('substitute the likelihood for known location days in the final posterior pdf'), would the best way to do this be to incorporate known location days in make.L(..., known.locs = ***) as well as via substitution in the output of hmm.smoother() prior to use of calc.track()? That way, the model is guided by those known location days during its construction and then the resulting guided track is later forced through those substituted points.

Yeah, this is worth a try and will probably work fine, although it may not be strictly kosher from a modeling perspective.

> For the second approach, could you elaborate on the benefit to uncertainty of modeling the track in segments and then conjoining them? I wonder about parameter optimization for this approach, given that the track as a whole may contain both behavioral states but a given segment may only contain one, meaning that some segments would ideally be run as a two-state and others as one-state (and require knowing which to use).

In this case, you're explicitly including the known locations throughout the modeling and track construction process (by splitting into segments/trajectories). So there would be no ad hoc "substitution" of knowns in the final posterior.

Re: parameters -> what if you performed a full model run using the whole dataset, adding known locations in make.L(), for parameter estimation? This is a totally reasonable approach for calculating the posteriors and should work just fine for getting your parameters. Then you could apply those parameters to the segmented model run that would result in your final track.

@galuardi
Collaborator

galuardi commented Mar 8, 2021

The more I thought about it, the less I liked the ad-hoc substitution (think before you type, ben!). It will certainly work in giving you the known locations at the correct times, but it would probably create nonsense in between them. I could foresee jumps in space if a trajectory veers away from the known point a bit only to jump back to the correct location. The track splitting would be the best for dealing with all aspects of the outputs.
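
One way to check for the kind of spatial jumps described above is to compute step lengths along the output track and flag any that exceed a plausible daily distance. The sketch below uses base R only; the track, the haversine_km() helper, and the 150 km/day threshold are all made up for illustration and are not taken from HMMoce or from any real tag.

```r
## Toy sketch: flag implausibly long steps in a daily track.
## Great-circle distance (km) via the haversine formula; r is Earth's mean radius
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

## made-up daily track; position 3 stands in for a substituted known
## location that sits far from the modeled positions on either side
tr_toy <- data.frame(lon = c(-70.0, -69.8, -66.0, -69.5, -69.3),
                     lat = c( 40.0,  40.1,  43.0,  40.3,  40.4))

## distance from each position to the next
n <- nrow(tr_toy)
step_km <- haversine_km(tr_toy$lon[-n], tr_toy$lat[-n],
                        tr_toy$lon[-1], tr_toy$lat[-1])

max_km_per_day <- 150   # hypothetical plausibility threshold
jumps <- which(step_km > max_km_per_day)
jumps                   # indices of steps flagged as jumps
```

In this toy track the steps into and out of position 3 are both flagged, which is the "veer away, then jump back" pattern.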

Other methods have dealt with this successfully: BSAM for Argos locations (not a grid-based method), and this paper:

Strøm, J.F., Thorstad, E.B., Chafe, G., Sørbye, S.H., Righton, D., Rikardsen, A.H., and Carr, J. 2017. Ocean migration of pop-up satellite archival tagged Atlantic salmon from the Miramichi River in Canada. ICES J Mar Sci 74(5): 1356–1370. doi:10.1093/icesjms/fsw220.

which was an HMM version that used known positions. No code available, though.

@camrinbraun
Owner Author

> Other methods have dealt with this successfully: BSAM for Argos locations (not a grid-based method), and this paper:
>
> Strøm, J.F., Thorstad, E.B., Chafe, G., Sørbye, S.H., Righton, D., Rikardsen, A.H., and Carr, J. 2017. Ocean migration of pop-up satellite archival tagged Atlantic salmon from the Miramichi River in Canada. ICES J Mar Sci 74(5): 1356–1370. doi:10.1093/icesjms/fsw220.
>
> which was an HMM version that used known positions. No code available, though.

Agreed. The Strøm paper is an excellent suggestion. @marosteg I'd say give this trajectory idea a shot and we'll go from there. I know J Carr well enough to ask him about their approach in this Atlantic salmon paper. It would be good to have some preliminary results from our trajectory attempts before I ask him, though. And if you want to test this on a different track, we do have known locations for a lot of blue sharks, including the individual used as the example data in HMMoce (141259, I think).

@marosteg
Collaborator

marosteg commented Mar 8, 2021

I just looked through the Strøm paper, but it is unclear how the acoustic detections were incorporated as known locations. It doesn't specify whether they were supplied as input likelihoods for model construction, ad-hoc substitutions in the posterior, or start/end points for modeled segments that were then conjoined. The only comment to this effect is, "[Acoustic tracking at entry and exit of Gulf of St. Lawrence] was done to increase the number of known positions independent of the PSAT data, and decrease the uncertainty of the geolocation model".

@galuardi
Collaborator

galuardi commented Mar 8, 2021

Without code, it's hard to say exactly how they did it. I had always hoped they would put out their own R package. I met Strøm and he indicated they would, but that was in 2015.

My guess is they did it via likelihood. They have a pretty small grid (10 km), and the bathymetry in that area provides nice constraints. Other than that, it's essentially an R rewrite of GPE3 (which was a rewrite of Pedersen 2011).

Other notes: they did use MTI raw geolocations for the lat/long likelihood. Most of the lats probably got removed, judging by their methods and my experience with them.
