Question to transition probability model #135

dvent7 · 2019-02-27T17:01:49Z

Hi,

Iam really wondering why the base parameter was removed from calculating the transition. I could not find any hinds in the issues.

c74a36f#diff-dede88fbc9079c51b81be7fbf26f4298R285

double transition = (1 / beta)
* Math.exp((-1.0) * route.cost(new TimePriority()) / beta);

The text was updated successfully, but these errors were encountered:

smattheis · 2019-03-02T09:35:48Z

I will come back to your question soon. As I think it's necessary to give you a detailed the answer here, I need to find a bit of time. Today won't work but next days should work. Sorry for the delay.

dvent7 · 2019-03-04T12:00:47Z

Ok, Thank you Sebastian.
I am curious why you changed the alogrithm this way. Take your time.

smattheis · 2019-03-10T17:00:30Z

Approach 1: The algorithm of Newson and Krumm

In Newson and Krumm's paper "Hidden Markov Map Matching Through Noise and Sparseness" [1], a sequence of emissions (position measurements) z₀, z₁, z₂, ... is the input. Each emission z_t consists of position (lat lon) and a time of measurement t. For each emission, we determine a set of state candidates (possible map positions) through a simple radius search given the measured position as the center point:

z₀ -> s_0,0, ..., s_0,N₀
z₁ -> s_1,0, ..., s_1,N₁
z₂ -> s_2,0, ..., s_2,N₂
...

The emission probability is modeled with a Gaussian distribution such that the probability of a candidate s_i,t given the respective measurement z_t is:

where σ_z is typically the standard error of GPS, i.e., 5-8 meters, and the great circle distance is the distance between two positions on the earth surface.

To model movements of objects in a road network, map positions must be connected with paths along the roads of the map denoted as transitions between state candidates and determined with a simple router.

The transition probability is modeled with an exponential distribution, such that the probability of a transition from a state s_t,j to a possible subsequent state s_t+1,i is:

where β is an experimentally determined constant but (!) strongly depends on the sampling interval. The route distance denotes the path length of the shortest path (!) between candidates while base distance is here also the great circle distance of positions on the earth surface.

Now, let's look at some pitfalls:

Pitfall A: The contradiction of shortest path preference with our intuition of road types

Consider an intersection as shown in the figure below where two main roads intersect (red) and where a side road (blue) is a shortcut that goes over rough and smooth such that although it's shorter it takes much longer than the path along the main roads.

Now, assume an object has taken the path along the main roads (green path) and we obtained a sequence of measurements z₀, z₁, z₂, z₃ as shown in the figure. Given the measurements, we can construct a scenario with the following presumptions:

p(s_1,0 | s_0,0) = p(s_1,1 | s_0,0)
p(s_3,0 | s_2,0) = p(s_3,0 | s_2,1)
p(s_1,0 | z₁) = p(s_1,1 | z₁)
p(s_2,0 | z₂) = p(s_2,1 | z₂)

The pitfall arises from the preference of shortest paths in the transition probability model such that

||s_2,0-s_1,0||_route > ||s_2,1-s_1,1||_route (> ||z₂-z₁||_{great circle}) ⇒ p(s_2,0|s_1,0) < p(s_2,1|s_1,1)

which means that s_0,0, s_1,1, s_2,1, s_3,0 is determined as the more likely path although it contradicts the intuition of our example where the object has taken the faster green path.

Pitfall B: The inversion of shortest path preference through absolute values

Consider a road network with several intersections as can be found in residential areas and as illustrated in the figure below.

Now, assume an object has taken the green path and we obtained position measurements z₀, z₁, z₂, z₃ as shown in the figure. Note that the measurement errors of z₁ and z₂ in this example and the shown distances d₀, d₁, and d₂ (perpendicular to the respective roads). Given the measurement errors, we can construct a scenario with the following presumptions:

p(s_1,0|z₁) = p(s_1,1|z₁) (d₀ = d₁)
||s_1,0-s_0,0||_route = ||s_1,1-s_0,0||_route ⇒ p(s_1,0|s_0,0) = p(s_1,1|s_0,0)
||s_2,0-s_1,0||_route < ||s_2,0-s_1,1||_route

The pitfall arises from the large error of z₂ with d₀, d₁ ≪ d₂ such that for presumption 3, we conclude:

||s_2,0-s_1,0||_route < ||s_2,0-s_1,1||_route ⇏ p(s_2,0|s_1,0) > p(s_2,0|s_1,1). (Not follows!)

The reason is the large measurement error of z₂ with ||z₂-z₁||_{great circle} > ||s_2,0-s_1,1||_route, such that it follows the inverse

||s_2,0-s_1,0||_route < ||s_2,0-s_1,1||_route ∧ ||z₂-z₁||_{great circle} > ||s_2,0-s_1,1||_route ⇒ p(s_2,0|s_1,0) < p(s_2,0|s_1,1)

which means that s_0,0, s_1,1, s_2,0, s_3,0 is determined as the more likely path although it contradicts the preference of shorter paths.

Approach 2: A quick-fix

To avoid pitfall A, we use a router that takes road types into account. Since Barefoot uses OpenStreetMap data, we followed the practices of well-known Open Source routers for OpenStreetMap. More specifically, we follow some principles of pgrouting and use a cost function that calculates the driving time t and multiplies it with a factor f that represents the road types. In detail, for each segment of the path we use the respective length of the segment s and its maximum speed v, calculate the time t (with t = s / v) and multiply it with a factor f ≥ 1 that is predefined for all types of roads, where major roads have a lower value and minor roads have a larger value, e.g., highways are assigned factor f = 1 and roads in residential areas are assigned f = 2.5.

This alternative cost function, however, requires also an alternative base distance that corresponds to the cost function. For that reason, we calculate the base distance similarly to the cost function but use the great circle distance between measurements, assume a maximum speed of 216 km/h (= 60 m/s) and the minimum road type factor of 1.

To avoid pitfall B, we also adapted the transition probability model to not use the absolute value for the difference between base distance and route distance but bound the difference to a minimum of zero, in the case that ||s_t+1,i-s_t,j||_route < ||z_t+1-z_t||_{great circle}, as follows:

This algorithm solves the problems of pitfall A but only some cases of problems of pitfall B. To be specific, reconsider the example of pitfall B. The algorithm will not obtain p(s_2,0|s_1,0) < p(s_2,0|s_1,1) but instead p(s_2,0|s_1,0) = p(s_2,0|s_1,1). That fixes the problems for some cases but it still can annul the preference of lower route costs because although ||s_2,0-s_1,0||_route < ||s_2,0-s_1,1||_route we can construct a scenario where we obtain p(s_2,0|s_1,0) = p(s_2,0|s_1,1) and observe:

||s_2,0-s_1,0||_route < ||s_2,0-s_1,1||_route ⇏ p(s_2,0|s_1,0) > p(s_2,0|s_1,1). (Not follows!)

The reason is simple: The difference of route costs in such cases is "absorbed" by the base distance!

The consequence is quite inadequate because the example of pitfall B can easily degenerate to cases where p(s_1,0|z₁) < p(s_1,1|z₁) (d₀ > d₁) such that also here we obtain again s_0,0, s_1,1, s_2,0, s_3,0 as the more likely path although it contradicts the preference of lower route distances. (Note, although such cases may seem artificial we've seen them occur. Please remember Murphy's law.)

Approach 3: A more robust solution

The solution we found for our use cases is as simple as elegant: The base distance is simply not necessary. This means, the transition probability describes an exponential distribution of the route distance defined as follows

where the route cost function is the same as for algorithm 2 and the expected value β is given with the time between the measurements (multiplied with factor 1).

The intuition is that the route distance that can be traveled within a given time interval follows an exponential distribution. Seen from a different angle, we describe the probability of route distance (time-based) between measurements which describes the original problem domain of the exponential distribution. Of course, one could argue for and against its applicability or the validity of the Poisson assumptions, required to assume a Poisson process and to justify the use of an exponential distribution. However, note that we observed in our data that cars spend lots of time waiting at traffic lights, in traffic jams, in parking scenarios, etc. Given that, there are good arguments to consider all Poisson assumptions to be satisfied such that we consider the exponential distribution to be a robust description of the route distance distribution.

A side effect for this approach, since β is given with the measurement interval we gain an inherent self-adaptability of the algorithm to varying sampling rates.

smattheis pinned this issue Mar 11, 2019

smattheis changed the title ~~Question to transition calculation~~ Question to transition probability model Mar 11, 2019

smattheis added the question label Mar 11, 2019

smattheis mentioned this issue Jan 26, 2020

Using time difference in computation of transmission probabilities bmwcarit/offline-map-matching#3

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Question to transition probability model #135

Question to transition probability model #135

dvent7 commented Feb 27, 2019

smattheis commented Mar 2, 2019

dvent7 commented Mar 4, 2019

smattheis commented Mar 10, 2019 •

edited

Question to transition probability model #135

Question to transition probability model #135

Comments

dvent7 commented Feb 27, 2019

smattheis commented Mar 2, 2019

dvent7 commented Mar 4, 2019

smattheis commented Mar 10, 2019 • edited

Approach 1: The algorithm of Newson and Krumm

Pitfall A: The contradiction of shortest path preference with our intuition of road types

Pitfall B: The inversion of shortest path preference through absolute values

Approach 2: A quick-fix

Approach 3: A more robust solution

smattheis commented Mar 10, 2019 •

edited