An unexpected behaviour with speed-based filtering #280

Open
ghost opened this issue Feb 1, 2024 · 0 comments

Hi!

I recently started using your library and came across unexpected behaviour in the speed-based filtering function: https://github.com/scikit-mobility/scikit-mobility/blob/master/skmob/preprocessing/filtering.py#L115

Consider a case where we move along the equator. For simplicity, the trajectory points are reported at 1-hour intervals. The trajectory is as follows:

  • starting at lat=0 deg and lon=0 deg
  • flying to lat=0 deg and lon=7.186609624 deg, i.e. 800 km/h w.r.t. the previous point
  • driving to lat=0 deg and lon=7.5458541615 deg, i.e. 40 km/h w.r.t. the previous point
  • continuing along the equator at 40 km/h for every subsequent point.

With the speed limit set to 50 km/h, I would expect the filter to remove only the second point, the only one where the limit was exceeded.
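As a quick sanity check of the figures above, here is a minimal sketch that computes the speed of each consecutive segment. It does not use the library: the `haversine_km` helper and the Earth radius of 6371 km are assumptions made for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


DEGS_800_KM = 7.186609624
DEGS_40_KM = 0.3592445375

# Longitudes of the 11 points; all latitudes are 0 and points are 1 hour apart.
lons = [0.0] + [DEGS_800_KM + i * DEGS_40_KM for i in range(10)]

# Speed (km/h) of the segment arriving at each point, from its predecessor.
speeds = [haversine_km(0.0, a, 0.0, b) / 1.0 for a, b in zip(lons, lons[1:])]

# Indices of points whose incoming segment exceeds 50 km/h.
over_limit = [i + 1 for i, v in enumerate(speeds) if v > 50.0]
print(over_limit)  # [1]
```

Only the point at index 1 (the second point) arrives on a segment faster than 50 km/h, which is why it is the only one I would expect to be dropped.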

The scenario is reproduced by the following code snippet:

from skmob.preprocessing.filtering import _filter_array
import numpy as np
import pandas as pd

degs_800_km = 7.186609624  # approx. math.degrees(800/6371)
degs_40_km = 0.3592445375  # approx. math.degrees(40/6371)

# Each row is [lat, lng, timestamp]: the origin, then ten hourly points along the equator.
lat_lng_time = ([np.array([0.0, 0.0, pd.Timestamp('2008-10-23 05:00:00')])]
                + [np.array([0.0, degs_800_km + i*degs_40_km, pd.Timestamp('2008-10-23 06:00:00') + pd.Timedelta(hours=i)]) for i in range(10)])

_filter_array(lat_lng_time, max_speed=50, include_loops=False, speed=50., max_loop=6, ratio_max=0.25)

Out: 
[array([0.0, 0.0, Timestamp('2008-10-23 06:00:00')]),
 array([0.0, 10.4198104615, Timestamp('2008-10-23 15:00:00')])]

This does not meet the expectation: only two points remain, and every point travelled at 40 km/h in between has been removed.

Having studied the code, it appears that the continue statement at https://github.com/scikit-mobility/scikit-mobility/blob/master/skmob/preprocessing/filtering.py#L142 is responsible for this behaviour.

Let's follow the execution of the loop:

  • i = 0, compare point i+1 with point i: the speed is 800 km/h, above the limit, so point i+1 is removed and the loop continues without incrementing i
  • i = 0, compare point i+1 with point i (points 2 and 0 of the original array): the speed is 420 km/h, above the limit, so point i+1 is removed and the loop continues without incrementing i
  • i = 0, compare point i+1 with point i (points 3 and 0 of the original array): the speed is 293 km/h, above the limit, so point i+1 is removed and the loop continues without incrementing i
  • ...
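
The steps above can be reproduced with a simplified stand-in for the loop. This is only a sketch, not the library's code: the name `filter_against_last_kept`, the `while i < len(points) - 2` bound (an assumption, chosen because it leaves the last point untested, which matches the reported output), and the use of nominal kilometre positions instead of haversine distances are all hypothetical.

```python
def filter_against_last_kept(positions_km, hours, max_speed_kmh):
    """Hypothetical stand-in: compare the next point with point i and
    do not advance i after a deletion."""
    points = list(zip(range(len(positions_km)), positions_km, hours))
    trace = []  # (removed original index, reference original index, speed)
    i = 0
    while i < len(points) - 2:  # assumed bound; the last point is never tested
        ref_idx, ref_km, ref_h = points[i]
        nxt_idx, nxt_km, nxt_h = points[i + 1]
        speed = abs(nxt_km - ref_km) / (nxt_h - ref_h)
        if speed > max_speed_kmh:
            trace.append((nxt_idx, ref_idx, speed))
            del points[i + 1]
            continue  # i is not incremented, so point i stays the reference
        i += 1
    return [idx for idx, _, _ in points], trace


# Nominal positions along the equator: 800 km in the first hour, then 40 km/h.
positions_km = [0.0] + [800.0 + 40.0 * k for k in range(10)]
hours = list(range(11))
kept, trace = filter_against_last_kept(positions_km, hours, 50.0)

for removed, ref, speed in trace[:3]:
    print(f"point {removed} vs point {ref}: {speed:.0f} km/h")
# point 1 vs point 0: 800 km/h
# point 2 vs point 0: 420 km/h
# point 3 vs point 0: 293 km/h
print("kept:", kept)  # kept: [0, 10]
```

Because the reference stays at point 0, the average speed from point 0 never drops below 50 km/h within the ten hours covered, so every intermediate point is removed.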

It seems that the filtering algorithm was designed to eliminate isolated points reached at high speed. However, the same method fails when the high-speed movement is sustained, e.g., when flying or traveling on a highway, because every later point is still compared with the last point before the fast segment.
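
For comparison, here is a sketch of one possible alternative that judges each point by the segment from its immediate predecessor in the original sequence. It is not the library's behaviour and not a finished patch; `filter_by_segment_speed` is a hypothetical name, and positions are again nominal kilometres along the equator.

```python
def filter_by_segment_speed(positions_km, hours, max_speed_kmh):
    """Keep a point unless the segment arriving at it, measured from its
    immediate predecessor in the original sequence, exceeds the limit."""
    kept = [0]  # the first point has no incoming segment
    for k in range(1, len(positions_km)):
        dist = abs(positions_km[k] - positions_km[k - 1])
        dt = hours[k] - hours[k - 1]
        if dist / dt <= max_speed_kmh:
            kept.append(k)
    return kept


positions_km = [0.0] + [800.0 + 40.0 * i for i in range(10)]
hours = list(range(11))
kept_indices = filter_by_segment_speed(positions_km, hours, 50.0)
print(kept_indices)  # [0, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Trade-off: an isolated spike also takes the following point with it,
# because the return segment from the spike is fast as well.
spike_kept = filter_by_segment_speed([0.0, 500.0, 10.0], [0, 1, 2], 50.0)
print(spike_kept)  # [0]
```

This variant gives the expected result for the scenario above, but it handles isolated outliers worse than the current approach, so a real fix would probably need to combine both ideas.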

Could you please confirm my findings? If that is the case, then I'll happily work on the issue.
