Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transfers counted incorrectly #88

Open
szaboildi opened this issue Nov 4, 2021 · 11 comments
Open

Transfers counted incorrectly #88

szaboildi opened this issue Nov 4, 2021 · 11 comments
Labels
bug Something isn't working

Comments

@szaboildi
Copy link

Hi! I found a bug with how transfers are counted.
I am using the new Sept 2021 - Dec 2021 HVV GTFS found here.
I am using gtfsrouter 0.0.5.042 with R version 4.0.5
When I read this file in

library(gtfsrouter)
library(data.table)

url <- "http://daten.transparenz.hamburg.de/Dataport.HmbTG.ZS.Webservice.GetRessource100/GetRessource100.svc/90a29c65-d9c5-4281-a45a-e12bf19a18fb/Upload__HVV_Rohdaten_GTFS_Fpl_20210903.zip"
file <- tempfile()
download.file(url, file, quiet = TRUE)

from_trip <- "S Hamburg Airport"
to_trip <- "Langobardenweg"
start_time_trip <- c(8, 0)
day_trip <- 3 # Tuesday

ham_gtfs <- extract_gtfs(file)
# v Unzipped GTFS archive  
# v Extracted GTFS feed 
# v Converted stop times to seconds 
# v Converted transfer times to seconds

gtfs_route(ham_gtfs, from_trip, to_trip, start_time_trip, day_trip, max_transfers = 0)
#    route_name                      trip_name                      stop_name arrival_time departure_time
# 1         292 Lufthansa-Basis (Haupteingang)              S Hamburg Airport     08:03:00       08:03:00
# 2         292 Lufthansa-Basis (Haupteingang)                Zeppelinstraße     08:05:00       08:05:00
# 3         292 Lufthansa-Basis (Haupteingang)                Röntgenstraße     08:06:00       08:06:00
# 4          23               U Niendorf Markt Lufthansa-Basis (Haupteingang)     08:12:00       08:12:00
# 5          23               U Niendorf Markt                   Paeplowstieg     08:14:00       08:14:00
# 6          23               U Niendorf Markt             Am Licentiatenberg     08:15:00       08:15:00
# 7          23               U Niendorf Markt                     Spreenende     08:17:00       08:17:00
# 8          23               U Niendorf Markt                    Warnckesweg     08:19:00       08:19:00
# 9          23               U Niendorf Markt             Stavenhagenstraße     08:20:00       08:20:00
# 10         23               U Niendorf Markt                      Bekstück     08:21:00       08:21:00
# 11         23               U Niendorf Markt             Niendorfer Straße     08:22:00       08:22:00
# 12         23               U Niendorf Markt               Vogt-Cordes-Damm     08:24:00       08:24:00
# 13        191               Krohnstiegtunnel               U Niendorf Markt     08:35:00       08:35:00
# 14        191               Krohnstiegtunnel               U Niendorf Markt     08:36:00       08:36:00
# 15        191               Krohnstiegtunnel             Fuhlsbütteler Weg     08:38:00       08:38:00
# 16        191               Krohnstiegtunnel                   Helvetierweg     08:39:00       08:39:00
# 17        191               Krohnstiegtunnel                     Krähenweg     08:40:00       08:40:00
# 18        191               Krohnstiegtunnel                 Langobardenweg     08:42:00       08:42:00

This issue persists even when gtfs_route() is run on a timetable and not the gtfs itself.

ham_ttable <- gtfs_timetable(ham_gtfs, day)
gtfs_route(ham_ttable , from_trip, to_trip, start_time_trip, day_trip, max_transfers = 0)

#    route_name                      trip_name                      stop_name arrival_time departure_time
# 1         292 Lufthansa-Basis (Haupteingang)              S Hamburg Airport     08:03:00       08:03:00
# 2         292 Lufthansa-Basis (Haupteingang)                Zeppelinstraße     08:05:00       08:05:00
# 3         292 Lufthansa-Basis (Haupteingang)                Röntgenstraße     08:06:00       08:06:00
# 4          23               U Niendorf Markt Lufthansa-Basis (Haupteingang)     08:12:00       08:12:00
# 5          23               U Niendorf Markt                   Paeplowstieg     08:14:00       08:14:00
# 6          23               U Niendorf Markt             Am Licentiatenberg     08:15:00       08:15:00
# 7          23               U Niendorf Markt                     Spreenende     08:17:00       08:17:00
# 8          23               U Niendorf Markt                    Warnckesweg     08:19:00       08:19:00
# 9          23               U Niendorf Markt             Stavenhagenstraße     08:20:00       08:20:00
# 10         23               U Niendorf Markt                      Bekstück     08:21:00       08:21:00
# 11         23               U Niendorf Markt             Niendorfer Straße     08:22:00       08:22:00
# 12         23               U Niendorf Markt               Vogt-Cordes-Damm     08:24:00       08:24:00
# 13        191               Krohnstiegtunnel               U Niendorf Markt     08:35:00       08:35:00
# 14        191               Krohnstiegtunnel               U Niendorf Markt     08:36:00       08:36:00
# 15        191               Krohnstiegtunnel             Fuhlsbütteler Weg     08:38:00       08:38:00
# 16        191               Krohnstiegtunnel                   Helvetierweg     08:39:00       08:39:00
# 17        191               Krohnstiegtunnel                     Krähenweg     08:40:00       08:40:00
# 18        191               Krohnstiegtunnel                 Langobardenweg     08:42:00       08:42:00

This issue is there in the traveltime table as well

ttimes <- gtfs_traveltimes(ham_gtfs, from_trip,
                           start_time_limits = c(8 * 60 * 60, 9 * 60 * 60), 
                           day = day, max_traveltime = 60 * 60, max)
ttimes[ttimes$stop_name == "Langobardenweg" & ttimes$ntransfers == 0,]
#      start_time duration ntransfers                stop_id      stop_name stop_lon stop_lat
# 1317   08:03:00 00:39:00          0 de:02000:91028::910093 Langobardenweg 9.966597 53.63475

Is this a bug or could this be an issue with the particular GTFS somehow?

@mpadge
Copy link
Member

mpadge commented Jan 30, 2022

Hi @szaboildi, sorry i somehow completely missed this issue. I'm not sure what you mean with the bug here; could you please explain a bit more? You show journeys with transfers, and they both match. Can you please explain why this is different from what you expect? Thanks

@szaboildi
Copy link
Author

Hi @mpadge , thanks for getting back to me! The issue is basically that this trip shows up as requiring 0 transfers (the max_transfers parameter is set to 0 in the gtfs_route() function at the top and thentransfers column in ttimes at the bottom is also 0) even though when details of the trip are accessed, it seems like it involves 3 routes (292, 23, 191-- in the route_name column) -- which actually matches the fastest route on google maps.

(The output of the gtfs_traveltimes() function in this case is likely the same trip that the gtfs_route() function finds, because the starting time and duration matches.)

@mpadge
Copy link
Member

mpadge commented Jan 31, 2022

But the documentation for the gtfs_route function explains the max_transfers parameter like this:

If not NA, specify a desired maximum number of transfers for the route (including but not exceeding this number). This parameter may be used to generate alternative routes with fewer transfers, although actual numbers of transfers may still exceed this number if a value is specified which exceeds the minimal feasible number of transfers.

Setting max_transfers = N only has any effect if it is possible to travel along a route with that number of transfers, in which case it will ensure that trips with any greater number of transfers are not returned, even where these may be faster. In your case, 0 transfers are simply impossible, and you get by default the minimal number which is 2.

@mpadge
Copy link
Member

mpadge commented Jan 31, 2022

I probably should update that documentation to change

if a value is specified which exceeds the minimal feasible number of transfers.

to the correct version of

if the value specified is less than the minimal feasible number of transfers.

@szaboildi
Copy link
Author

Thank you for the explanation, that explains the gtfs_route() part of the problem perfectly! Why does the gtfs_traveltimes() function return this trip with 0 transfers though? Shouldn't the query in the last lineline (below) result in an empty df?

ttimes <- gtfs_traveltimes(ham_gtfs, from_trip, start_time_limits = c(8 * 60 * 60, 9 * 60 * 60), day = day, max_traveltime = 60 * 60, max) ttimes[ttimes$stop_name == "Langobardenweg" & ttimes$ntransfers == 0,]

@mpadge
Copy link
Member

mpadge commented Feb 8, 2022

Yes, that gtfs_travetimes result does possibly reflect a bug. I'll keep this issue open and investigate that further soon. Thanks

@mpadge mpadge added the bug Something isn't working label Aug 16, 2022
@mpadge mpadge closed this as completed in 69e5406 Aug 16, 2022
@mpadge
Copy link
Member

mpadge commented Aug 16, 2022

@szaboildi Sorry that it's taken so long to get back to this. The above commit fixes the bug you uncovered, and you should now see this:

library (gtfsrouter)
packageVersion ("gtfsrouter")
#> [1] '0.0.5.91'

url <- "http://daten.transparenz.hamburg.de/Dataport.HmbTG.ZS.Webservice.GetRessource100/GetRessource100.svc/90a29c65-d9c5-4281-a45a-e12bf19a18fb/Upload__HVV_Rohdaten_GTFS_Fpl_20210903.zip"
file <- "./feeds/HVV.zip"
if (!file.exists (file)) {
    download.file(url, file, quiet = FALSE)
|

from_trip <- "S Hamburg Airport"
to_trip <- "Langobardenweg"
start_time_trip <- 8 * 3600
day_trip <- 3 # Tuesday

ham_gtfs <- extract_gtfs(file)
#> ▶ Unzipping GTFS archive✔ Unzipped GTFS archive  
#> ▶ Extracting GTFS feed✔ Extracted GTFS feed 
#> ▶ Converting stop times to seconds✔ Converted stop times to seconds 
#> ▶ Converting transfer times to seconds✔ Converted transfer times to seconds
ham_gtfs <- gtfs_transfer_table (ham_gtfs)
route <- gtfs_route(ham_gtfs, from_trip, to_trip, start_time_trip, day_trip, max_transfers = 0)
# same as above: lists 2 transfers, and takes 39 minutes

ttimes <- gtfs_traveltimes(ham_gtfs, from_trip,
                           start_time_limits = c(8 * 60 * 60, 9 * 60 * 60), 
                           day = day_trip, max_traveltime = 60 * 60)
ttimes[ttimes$stop_name == to_trip, ]
#>      start_time duration ntransfers                stop_id      stop_name
#> 1163   08:03:00 00:39:00          2 de:02000:91028::910093 Langobardenweg
#> 1168   08:03:00 00:36:00          4 de:02000:91028::910098 Langobardenweg
#>      stop_lon stop_lat
#> 1163 9.966597 53.63475
#> 1168 9.966927 53.63527

Created on 2022-08-16 by the reprex package (v2.0.1)

That now lists the correct number of transfers, along with an additional possibility to get there 3 minutes quicker with 4 transfers.

Description of bug

The traveltimes algorithm was not counting "implicit" transfers to different services (that is, "trip_id" values) from the same platform ("stop_id" values). The above commit identifies these changes of service, and appropriately increments the aggregated number of transfers.


I now have to re-open this issue, because the gtfs_route() function then reveals another bug here. Setting max_transfers >= 4 should give the 4-transfer route, but instead returns this incorrect result:

library (gtfsrouter)
url <- "http://daten.transparenz.hamburg.de/Dataport.HmbTG.ZS.Webservice.GetRessource100/GetRessource100.svc/90a29c65-d9c5-4281-a45a-e12bf19a18fb/Upload__HVV_Rohdaten_GTFS_Fpl_20210903.zip"
file <- "./feeds/HVV.zip"
if (!file.exists (f)) {
    download.file(url, file, quiet = FALSE)
}

from_trip <- "S Hamburg Airport"
to_trip <- "Langobardenweg"
start_time_trip <- 8 * 3600
day_trip <- 3 # Tuesday

ham_gtfs <- extract_gtfs(file)
#> ▶ Unzipping GTFS archive✔ Unzipped GTFS archive  
#> ▶ Extracting GTFS feed✔ Extracted GTFS feed 
#> ▶ Converting stop times to seconds✔ Converted stop times to seconds 
#> ▶ Converting transfer times to seconds✔ Converted transfer times to seconds
ham_gtfs <- gtfs_transfer_table (ham_gtfs)
gtfs_route(ham_gtfs, from_trip, to_trip, start_time_trip, day_trip, max_transfers = 4)
#>    route_name         trip_name                       stop_name arrival_time
#> 1          S1             Wedel     Hamburg Airport (Flughafen)     08:03:00
#> 2          S1             Wedel                        Ohlsdorf     08:06:00
#> 3          U1 Norderstedt Mitte                        Ohlsdorf     08:17:00
#> 4          U1 Norderstedt Mitte                   Klein Borstel     08:18:00
#> 5          U1 Norderstedt Mitte                     Fuhlsbüttel     08:19:00
#> 6          U1 Norderstedt Mitte                Fuhlsbüttel Nord     08:21:00
#> 7          U1 Norderstedt Mitte                Langenhorn Markt     08:23:00
#> 8         193        U Garstedt U Langenhorn Markt (Krohnstieg)     08:26:00
#> 9         193        U Garstedt                    Rittmerskamp     08:27:00
#> 10        193        U Garstedt                      Samlandweg     08:28:00
#> 11        193        U Garstedt                   Wrangelkoppel     08:29:00
#> 12        193        U Garstedt                      Ermlandweg     08:30:00
#> 13        191        Grothwisch                Krohnstiegtunnel     08:34:00
#> 14        191        Grothwisch                    Sperlingsweg     08:36:00
#> 15        191        Grothwisch                        Moorrand     08:37:00
#> 16        191        Grothwisch                   Grotkoppelweg     08:38:00
#> 17        191        Grothwisch                  Langobardenweg     08:39:00
#> 18       <NA>        (transfer)               S Hamburg Airport     00:44:00
#>    departure_time
#> 1        08:03:00
#> 2        08:09:00
#> 3        08:17:00
#> 4        08:18:00
#> 5        08:19:00
#> 6        08:21:00
#> 7        08:23:00
#> 8        08:26:00
#> 9        08:27:00
#> 10       08:28:00
#> 11       08:29:00
#> 12       08:30:00
#> 13       08:34:00
#> 14       08:36:00
#> 15       08:37:00
#> 16       08:38:00
#> 17       08:39:00
#> 18           <NA>

That seems to be just a final error with the returned order of the stops along the route.

@mpadge mpadge reopened this Aug 16, 2022
mpadge added a commit that referenced this issue Aug 16, 2022
mpadge added a commit that referenced this issue Aug 17, 2022
Those parameters aren't needed; the solution is to entirely prevent initial transfers
@mpadge mpadge closed this as completed in daab5e3 Aug 17, 2022
@mpadge
Copy link
Member

mpadge commented Aug 17, 2022

That last commit removes any transfers from being traced from the set of start stations, and so fixes the bug here. Running gtfs_traveltimes() still gives the same answer as above, with 2 transfers for 39 minutes, or 4 for 36, but the latter is no longer the incorrectly-timed version that started with a transfer. The gtfs_route() function still gives that, but that will be updated via #71, to allow full tracing of each connection listed in gtfs_traveltimes().

@szaboildi
Copy link
Author

Works as described, thank you!

@jh0ker
Copy link

jh0ker commented Oct 5, 2022

@mpadge As I mentioned here, I think I've found a wrong behavior in the gtfs_traveltimes function and it looks like it's part of this bug here. It's easily demonstrated with your example you posted before:

@szaboildi Sorry that it's taken so long to get back to this. The above commit fixes the bug you uncovered, and you should now see this:

library (gtfsrouter)
packageVersion ("gtfsrouter")
#> [1] '0.0.5.91'

url <- "http://daten.transparenz.hamburg.de/Dataport.HmbTG.ZS.Webservice.GetRessource100/GetRessource100.svc/90a29c65-d9c5-4281-a45a-e12bf19a18fb/Upload__HVV_Rohdaten_GTFS_Fpl_20210903.zip"
file <- "./feeds/HVV.zip"
if (!file.exists (file)) {
    download.file(url, file, quiet = FALSE)
|

from_trip <- "S Hamburg Airport"
to_trip <- "Langobardenweg"
start_time_trip <- 8 * 3600
day_trip <- 3 # Tuesday

ham_gtfs <- extract_gtfs(file)
#> ▶ Unzipping GTFS archive✔ Unzipped GTFS archive  
#> ▶ Extracting GTFS feed✔ Extracted GTFS feed 
#> ▶ Converting stop times to seconds✔ Converted stop times to seconds 
#> ▶ Converting transfer times to seconds✔ Converted transfer times to seconds
ham_gtfs <- gtfs_transfer_table (ham_gtfs)
route <- gtfs_route(ham_gtfs, from_trip, to_trip, start_time_trip, day_trip, max_transfers = 0)
# same as above: lists 2 transfers, and takes 39 minutes

ttimes <- gtfs_traveltimes(ham_gtfs, from_trip,
                           start_time_limits = c(8 * 60 * 60, 9 * 60 * 60), 
                           day = day_trip, max_traveltime = 60 * 60)
ttimes[ttimes$stop_name == to_trip, ]
#>      start_time duration ntransfers                stop_id      stop_name
#> 1163   08:03:00 00:39:00          2 de:02000:91028::910093 Langobardenweg
#> 1168   08:03:00 00:36:00          4 de:02000:91028::910098 Langobardenweg
#>      stop_lon stop_lat
#> 1163 9.966597 53.63475
#> 1168 9.966927 53.63527

Created on 2022-08-16 by the reprex package (v2.0.1)

That now lists the correct number of transfers, along with an additional possibility to get there 3 minutes quicker with 4 transfers.

This result still looks correct when running it with the latest master version (0.0.5.123), however if you specify gtfs_traveltimes(..., minimise_transfers = TRUE), you can see the transfers are still being calculated wrong:

ttimes_mintransfer <- gtfs_traveltimes(ham_gtfs, from_trip,
                           start_time_limits = c(8 * 60 * 60, 9 * 60 * 60), 
                           day = day_trip, max_traveltime = 60 * 60, 
                           minimise_transfers = TRUE)
ttimes_mintransfer[ttimes_mintransfer$stop_name == to_trip, ]
#>      start_time duration ntransfers                stop_id      stop_name stop_lon stop_lat
#> 1353   08:33:00 00:39:00          0 de:02000:91028::910093 Langobardenweg 9.966597 53.63475
#> 1358   08:33:00 00:45:00          1 de:02000:91028::910098 Langobardenweg 9.966927 53.63527

Let me know if you need any more info or anything. Cheers!

@mpadge
Copy link
Member

mpadge commented Oct 5, 2022

Shall do @jh0ker, thanks! I recently also noticed odd behaviour when calling with minimise_transfers = TRUE.I'll re-open now and hopefully fix asap.

@mpadge mpadge reopened this Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants