Example code creating an error in 'stop_I' for stops #17

BeishuizenTimKPMG · 2019-03-04T15:01:30Z

The stop ids in the example are taken from the index of the pandas dataframe instead of from the 'stop_I' column. This is not a problem in the Finnish dataset, but the ids do not always equal the index number (for example, the Dutch public transport dataset from www.openOV.nl starts at 1 instead of 0).

Code change proposals, FROM:

stop_dict = G.stops().to_dict("index")
for stop_I, data in stop_dict.items():
    if data['name'] == from_stop_name:
        from_stop_I = stop_I
    if data['name'] == to_stop_name:
        to_stop_I = stop_I
assert (from_stop_I is not None)
assert (to_stop_I is not None)

TO:

stop_data = OV_data.stops()

from_stop_I = stop_data[stop_data['name'] == from_stop_name].stop_I.values[0]
to_stop_I = stop_data[stop_data['name'] == to_stop_name].stop_I.values[0]

AND FROM:

stop_dict = G.stops().to_dict("index")
print("Origin: ", stop_dict[from_stop_I])
print("Destination: ", stop_dict[to_stop_I])

TO:

stop_data = OV_data.stops()
print("Origin: ", stop_data[stop_data['stop_I'] == from_stop_I])
print("Destination: ", stop_data[stop_data['stop_I'] == to_stop_I])
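One caveat with the proposed lookup: `.values[0]` raises a bare IndexError when no stop has the given name, and silently takes the first match when several stops share a name. A minimal sketch of a stricter lookup follows. The helper name `find_stop_I` and the toy dataframe are hypothetical and not part of gtfspy; the sketch only assumes that `stops()` returns a dataframe with 'stop_I' and 'name' columns, as in the snippets above.

```python
import pandas as pd


def find_stop_I(stops: pd.DataFrame, name: str) -> int:
    """Return the 'stop_I' of the single stop called `name` (hypothetical helper)."""
    # Select the 'stop_I' column for the rows whose name matches.
    matches = stops.loc[stops["name"] == name, "stop_I"]
    if matches.empty:
        raise KeyError(f"no stop named {name!r}")
    if len(matches) > 1:
        raise ValueError(f"{len(matches)} stops named {name!r}")
    return int(matches.iloc[0])


# Toy stand-in for the dataframe returned by stops(); the ids start at 1 here.
stops = pd.DataFrame({"stop_I": [1, 2, 3], "name": ["A", "B", "C"]})
from_stop_I = find_stop_I(stops, "B")
to_stop_I = find_stop_I(stops, "C")
```

With this, a misspelled stop name fails with a readable message instead of an IndexError.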


rmkujala · 2019-03-05T08:11:21Z

Which file are you referring to here?

Please provide also the (necessary) part for reproducing your problem & the error message that occurred.

Thanks!

BeishuizenTimKPMG · 2019-03-05T08:45:50Z

The code is taken directly from the example in "gtfspy/examples/example_temporal_distance_profile.py". The snippets are from lines 15-22 and 69-71.

The problem can be seen using the first snippet of code (I called my GTFS object OV_data instead of G, which I did not mention in my previous comment, my apologies). The bug does not produce an error message, but adding a check for mismatches makes it visible:

stop_dict = OV_data.stops().to_dict("index")
for stop_I, data in stop_dict.items():
    if stop_I != data['stop_I']:
        print('The index number is different from the actual ID')
        print('Index: ' + str(stop_I))
        print('Stop_I: ' + str(data['stop_I']))
        break
    if data['name'] == from_stop_name:
        from_stop_I = stop_I
    if data['name'] == to_stop_name:
        to_stop_I = stop_I
assert (from_stop_I is not None)
assert (to_stop_I is not None)

With my dataset this prints the following, which shows the off-by-one mismatch between index and id:

The index number is different from the actual ID
Index: 0
Stop_I: 1

As a result, the example uses the wrong stops as origin and destination.

Gtfs itself refers directly to the "stop_I" key, so reading this key straight from the pandas dataframe resolves the issue.
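To illustrate the mismatch without a GTFS database, here is a small sketch on a toy dataframe (hypothetical data, ids starting at 1 as in the Dutch dataset). It also shows one possible alternative that keeps the dictionary-style lookup of the original example: setting 'stop_I' as the index before calling `to_dict("index")`. This is only a sketch, not what the gtfspy example currently does.

```python
import pandas as pd

# Toy stand-in for the stops dataframe: default index 0..2, ids 1..3.
stops = pd.DataFrame({"stop_I": [1, 2, 3], "name": ["A", "B", "C"]})

# Keys are the positional index (0, 1, 2), as in the original example.
by_index = stops.to_dict("index")

# Keys are the actual ids (1, 2, 3); 'stop_I' moves from the values to the keys.
by_stop_I = stops.set_index("stop_I").to_dict("index")

# True only when every index value equals the 'stop_I' in the same row.
index_matches_ids = bool(
    (stops.index.to_numpy() == stops["stop_I"].to_numpy()).all()
)

print(index_matches_ids)
print(by_index[1])
print(by_stop_I[1])
```

Looking up key 1 in `by_index` returns the second row (stop "B"), whereas `by_stop_I` returns the stop whose id is actually 1 (stop "A").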

I hope this makes it clearer. If not, I am happy to answer additional questions.
