Selected-link analysis for assignment #913

Open
abyrd opened this issue Nov 29, 2023 · 1 comment · May be fixed by #924

Comments

abyrd commented Nov 29, 2023

A customer has described a process for using R5 for what is typically called "selected link analysis" in traffic models, in ways related to the assignment phase of such models.

The process is currently somewhat convoluted, as R5's internal information about paths between specific origins and destinations is not broken out by link. One can export paths between a set of origins and destinations, but there is one full end-to-end path on each line of CSV. These full paths must be post-processed to find relevant links. Additional workarounds are necessary to pre-select origin-destination pairs for which some paths pass through the links of interest.

The customer does not mind scaling origin-destination flows to reflect future scenarios, then assigning those differences proportionately to links in manual post-processing. But to do this assignment step efficiently, they need some indication of how flows are spread over links within the geographic areas of interest. So the simplest change enabling this kind of analysis is to break out the link-level information on separate CSV lines.

Doing this for all origins, destinations, and links would yield O × D × L rows of CSV, which grows very large very quickly. In addition, for most use cases that output would need to be immediately filtered down to a small number of links and a subset of origin-destination pairs. The huge intermediate table is not needed, and the filtering would introduce extra manual steps. Instead, all filtering could be performed in a streaming manner within R5 itself.

Selecting a small set of links by unique ID is not trivial, because network modifications in various scenarios can introduce new links along a road of interest. So it is expected that selection by geographic bounding box/polygon will give a smoother workflow over multiple scenarios.

The customer does not mind summing or otherwise manipulating final values for a number of different links, as long as they have been filtered down to only the links in the area of interest and are well labeled. If multiple bus routes from GTFS and scenarios all pass along a particular road, it is acceptable to report them separately as just routes, without attempting to use heuristics to sum traffic along each distinct road segment. So the proposed solution is to create CSV output where each line is (origin, destination, route_name, proportion). Note that some of the "iterations" for a particular origin-destination pair may pass outside the selected area, so if the proportion field is not a raw count, it should be a proportion of the total number of iterations for that OD pair, not of the number of iterations passing through the selected area. The output may be filtered down to only the origin-destination pairs where some proportion of the paths pass through the selected area, with the understanding that all other pairs have a proportion of zero.
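As a rough illustration of the proposed (origin, destination, route_name, proportion) output, here is a minimal Python sketch. It is not R5 code: the `iterations` input format and the function name `select_link_rows` are assumptions made for this example, and it presumes some earlier step has already determined which routes of each iteration pass through the selected area.

```python
from collections import Counter, defaultdict


def select_link_rows(iterations):
    """Build (origin, destination, route_name, proportion) rows.

    `iterations` is a hypothetical input: one (origin, destination,
    routes_in_area) tuple per iteration, where routes_in_area lists the
    names of routes whose ridden legs pass through the selected area
    (empty when the path stays outside it).
    """
    totals = Counter()            # all iterations per OD pair
    hits = defaultdict(Counter)   # iterations per OD pair and route in area
    for origin, destination, routes_in_area in iterations:
        od = (origin, destination)
        totals[od] += 1
        for route in set(routes_in_area):
            hits[od][route] += 1
    rows = []
    for (origin, destination), per_route in hits.items():
        for route, count in sorted(per_route.items()):
            # The denominator is every iteration for the OD pair, not only
            # those that pass through the selected area.
            total = totals[(origin, destination)]
            rows.append((origin, destination, route, count / total))
    return rows


example_iterations = [
    ("1", "2", ["bus 10"]),
    ("1", "2", []),                    # passes outside the selected area
    ("1", "2", ["bus 10", "bus 20"]),
    ("1", "8", []),                    # never enters the selected area
]
example_rows = select_link_rows(example_iterations)
```

The pair 1 to 8 produces no rows, which matches the convention that omitted pairs have a proportion of zero. Note that in this sketch an iteration using two routes inside the area is counted once under each route, so the rows for one OD pair can sum to more than 1.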

One problem that arose in the current roundabout implementation of this selected link analysis is that routes introduced by scenario modifications are identified in the CSV output with a random UUID, and although the user-specified name of the modification is assigned to a field of the generated route object, it is currently not possible to get that name into the output CSV without patching the source code (see https://github.com/conveyal/r5/tree/path-route-names). Any new output CSV should probably include the route short name when identifying network links, so that the names of modifications are visible in the output instead of random UUIDs.

abyrd commented Dec 12, 2023

If the paths between an origin and a destination are to be allocated to different transit routes, the categorization needs to be mutually exclusive. Otherwise the proportions for those paths, together with the proportion of paths passing outside the area of interest, can sum to more than 100%, and the figures will not be useful for scaling existing OD flows. It also becomes impossible to infer which proportion of trips do not pass through the selected area.

For example, if a transfer between two buses happens inside the selection polygon, that particular path must not count twice, once for each bus route it uses that's inside the polygon (unless it counts 1/N for each of those N routes).
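The 1/N option can be sketched as follows. This is only an illustration in Python, not the R5 implementation: the input format (one tuple per iteration listing the routes used inside the polygon) and the name `exclusive_route_shares` are hypothetical. Exact fractions are used so the sums can be checked without rounding error.

```python
from collections import defaultdict
from fractions import Fraction


def exclusive_route_shares(iterations):
    """Split each iteration evenly over the N routes it uses in the polygon.

    `iterations` is a hypothetical list of (origin, destination,
    routes_in_polygon) tuples, one per iteration. Returns
    {(origin, destination): {route: share}}, where shares are fractions of
    all iterations for that OD pair.
    """
    totals = defaultdict(int)
    weights = defaultdict(lambda: defaultdict(Fraction))
    for origin, destination, routes_in_polygon in iterations:
        od = (origin, destination)
        totals[od] += 1
        distinct = set(routes_in_polygon)
        for route in distinct:
            # Each iteration contributes a total weight of exactly 1.
            weights[od][route] += Fraction(1, len(distinct))
    return {
        od: {route: w / totals[od] for route, w in per_route.items()}
        for od, per_route in weights.items()
    }


transfer_example = [
    ("1", "2", ["A", "B"]),  # transfer inside the polygon: 1/2 to each route
    ("1", "2", ["A"]),
    ("1", "2", []),          # passes outside the polygon
    ("1", "2", []),
]
shares = exclusive_route_shares(transfer_example)
inside = sum(shares[("1", "2")].values())
outside = 1 - inside
```

Because each iteration carries a total weight of one, the per-route shares plus the share passing outside the polygon sum to exactly 1 for each OD pair.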

The question then arises whether we need the proportions broken down by route at all. The real goal here is to identify which proportion of traffic between each O-D pair passes through a given street, in order to reason about physical congestion of road infrastructure. It might make more sense for each polygon to enclose just a single street or street segment, and for each OD pair, to sum all itineraries on any route that pass through the polygon. The output would be like this:

origin_id, destination_id, polygon_id, proportion
1, 2, A, 0.123
1, 2, B, 0.200
1, 8, A, 0.304
1, 8, B, 0.106

So for every OD pair where at least one trip passes through a polygon, you get one row per polygon the trips pass through, giving the proportion of that pair's trips passing through it. These proportions multiplied by the existing flows for those ODs should be the increases in passenger flows on that street.
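The multiplication step might look like the following Python sketch. The proportions are the sample rows above; the `od_flows` values and the function name `flows_per_polygon` are invented for illustration only and do not come from any real dataset or from R5.

```python
import csv
import io
from collections import defaultdict

# Sample output in the format shown above.
PROPORTIONS_CSV = """origin_id, destination_id, polygon_id, proportion
1, 2, A, 0.123
1, 2, B, 0.200
1, 8, A, 0.304
1, 8, B, 0.106
"""


def flows_per_polygon(proportions_csv, od_flows):
    """Sum proportion * OD flow over all OD pairs, separately per polygon.

    `od_flows` is a hypothetical mapping from (origin_id, destination_id)
    to a flow (for example a scenario's change in trips). OD pairs missing
    from the mapping are treated as having zero flow.
    """
    # skipinitialspace drops the blank after each comma in header and rows.
    reader = csv.DictReader(io.StringIO(proportions_csv), skipinitialspace=True)
    totals = defaultdict(float)
    for row in reader:
        od = (row["origin_id"], row["destination_id"])
        totals[row["polygon_id"]] += float(row["proportion"]) * od_flows.get(od, 0.0)
    return dict(totals)


# Made-up flows, purely for illustration.
example_flows = {("1", "2"): 1000.0, ("1", "8"): 500.0}
polygon_flows = flows_per_polygon(PROPORTIONS_CSV, example_flows)
```

With these made-up flows, polygon A receives 0.123 × 1000 + 0.304 × 500 and polygon B receives 0.200 × 1000 + 0.106 × 500.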

It may not matter whether these are broken down by transit line at all.

In any case, any breakdown should be done in such a way that the total sums to 100%.

abyrd added a commit that referenced this issue Jan 3, 2024
this can be used as an experimental worker version
with a custom modification of r5type select-link
addresses #913
@abyrd abyrd linked a pull request Jan 3, 2024 that will close this issue
abyrd added a commit that referenced this issue Jan 11, 2024
this can be used as an experimental worker version
with a custom modification of r5type select-link
addresses #913
@abyrd abyrd linked a pull request Jan 12, 2024 that will close this issue