
#5325 Store and show way ids #5968

Open · wants to merge 11 commits into base: master
Conversation

nnseva

@nnseva nnseva commented Feb 22, 2021

Issue #5325

  • Features:
    • API:
      • Fixed item type for the nodes section of the annotations flatbuffers output format

@akashihi
Contributor

@nnseva Thank you for your contribution :) Let's wait for more opinions before merging.

I'm also curious: have you estimated what the dataset size change is?

@nnseva
Author

nnseva commented Feb 24, 2021

@akashihi

I'm also curious: have you estimated what the dataset size change is?

The unpacked dataset size should increase by 8*N bytes, where N is the number of indexes in the segments file. The segments file will contain an additional /osm_ways section with the array of way IDs corresponding to the segments array.

I haven't estimated the real packed dataset file size.

Member

@mjjbell mjjbell left a comment


LGTM, some minor code comments.

Not sure if it matters for your use-case, but you'll currently get duplicates returned in the annotation if the way can't be fully compressed (e.g. one of the nodes is an intersection).

For example:

    Scenario: way annotations
        Given the query options
            | annotations | true |

        Given the node map
            """
            a--b--c
               |
               d
            """

        And the ways
            | nodes |
            | abc   |
            | bd    |


        When I route I should get
          | from | to | route    | 
          |    a |  c | abc,abc  |

The response will contain:

"ways": [ 5, 5 ],

Deduplication could be performed client-side as an alternative.

With regards to data size, you can get an idea of segment count by looking at the R-tree segs column from this spreadsheet, as it should be almost identical.

For some of the bigger map extracts, unpacked size will be:

  • Germany ~250MB
  • North America ~2.3 GB
  • Europe ~2.5 GB

I don't have numbers on hand for how much RAM is currently needed for routing on those extracts; it's probably adding a few percent?

@mjjbell
Member

mjjbell commented Feb 24, 2021

Not sure if it matters for your use-case, but you'll currently get duplicates returned in the annotation if the way can't be fully compressed (e.g. one of the nodes is an intersection).

In fact, we'll get this even if the way is fully compressed. If a way contains N edges, we'll get that way returned N times in a row in the annotation.

@nnseva
Author

nnseva commented Feb 25, 2021

@mjjbell

"ways": [ 5, 5 ],

Sure, every pair of sequentially neighboring nodes has a separate member in the ways array of the annotations:

    ...
    "nodes": [1, 2, 3],
    "ways": [5, 5]
    ...

(meaning that nodes 1, 2, and 3 all belong to way 5 and are connected sequentially).

This duplication makes it possible to recognize the path structure without additional analysis of the OSM data. The sign of the returned way ID plays a similar role: positive means the path segment direction matches the way direction, while negative means the path direction is opposite to the way direction. For example, the response for a request in the opposite direction over the same OSM data might look like:

   ...
   "nodes": [3, 2, 1],
   "ways": [-5, -5]
   ...

I should also note that the current code always returns annotations containing all nodes along the found path, without any compression.

I've only added way IDs (with a direction sign) between every sequential pair of nodes in the returned node sequence.

@jcoupey

jcoupey commented Feb 25, 2021

I don't have numbers on hand for how much RAM is currently needed for routing on those extracts, it's probably adding a few %?

I just did an extract/contract pass on the latest germany pbf using v5.24.0 and the default car profile with excludables disabled. osrm-routed then uses about 3.3G of RAM at startup, so that sounds like a ~7.5% increase in RAM usage in that case.

Given the impact here and the fact that RAM consumption has always been a touchy subject (often a bottleneck when using OSRM), maybe we could further investigate the impact? I can report some figures on a bigger extract such as Europe when the preprocessing is done.

Maybe a silly question (I did not look up the implementation): would it be possible to make this change optional? There are definitely scenarios where this is super handy, but in other situations it would be just a pain to require extra RAM for an unused feature.

UPDATE 2: RAM usage for osrm-routed on europe-210208 with the same setup is 26.2G for master and 29G for this PR, so it's more than a 10% increase in memory requirements. Quite significant.

@mjjbell
Member

mjjbell commented Feb 26, 2021

UPDATE 2: RAM usage for osrm-routed on europe-210208 with the same setup is 26.2G for master and 29G for this PR, so it's more than a 10% increase in memory requirements. Quite significant.

Given this, I'm inclined to agree that this should be stored as a separate file that can be optionally loaded.

@nnseva
Author

nnseva commented Mar 8, 2021

@jcoupey @mjjbell

UPDATE 2: RAM usage for osrm-routed on europe-210208 with the same setup is 26.2G for master and 29G for this PR, so it's more than a 10% increase in memory requirements. Quite significant.

I should point out that the memory used for the additional segment data (the way IDs vector) is actually part of the memory-mapped storage (see https://github.com/Project-OSRM/osrm-backend/pull/5968/files#diff-ac92f7a34eb9d5b8b41e6a930db0f2e7356af695724842b390032f1e10842dc2R138). I'm not sure we can rely on memory consumption measured this way.

@eveyeti

eveyeti commented Mar 14, 2021

Hi @nnseva,

I tried to create a Docker image with your PR, but when it compiles I always get the same error.

c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[2]: *** [CMakeFiles/EXTRACTOR.dir/build.make:323: CMakeFiles/EXTRACTOR.dir/src/extractor/scripting_environment_lua.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:717: CMakeFiles/EXTRACTOR.dir/all] Error 2
make: *** [Makefile:152: all] Error 2

How can I fix the compilation error?

CreatingDockerImage.log

Regards
Etienne

@nnseva
Author

nnseva commented Mar 15, 2021

Hi @eveyeti,

c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.

I don't know exactly, but usually such a problem means there is not enough memory to compile. Try increasing the memory and/or reducing the number of processor cores available to Docker in your Dockerfile.

Compare your Dockerfile against the OSRM master branch.

@nnseva
Author

nnseva commented Mar 17, 2021

@jcoupey @mjjbell

I've implemented skipping OSM way IDs in memory.

The extractor can now skip storing OSM way IDs; the optional --skip-osm-ways parameter of the extractor turns this on.

If the OSM way IDs section is skipped, the router will skip showing them in annotations.

The corresponding data section is not filled (a 0-size vector is created instead) and doesn't consume memory.

@frodrigo
Member

Since this feature requires more memory than before and is not required in the main use case, wouldn't it be preferable to have an option to enable it rather than one to disable it?

@nnseva
Author

nnseva commented Mar 17, 2021

@frodrigo ,

Since this feature requires more memory than before and is not required in the main use case, wouldn't it be preferable to have an option to enable it rather than one to disable it?

  1. Notice that the list of OSM node IDs in the annotation is also not required in the main use case, yet it is always on.
  2. The annotation section list is determined by enumeration values at compile time. It would be undesirable behavior if one of the sections never appeared by default.

@mjjbell
Member

mjjbell commented Mar 19, 2021

I agree that this should be opt-in rather than opt-out (that was the initial proposal in #5325).
I expect most users will use the default processing setup, so we should optimise for that and give advanced users the option to customise.

Notice that the list of OSM node IDs in the annotation is also not required in the main use case, yet it is always on.

That's a fair point. However, there is interest in an alternative file loading mechanism that reduces memory footprint without a performance penalty (#5974, #5879, #5838). In which case, I can see OSM nodes also being an optionally loaded dataset in the not too distant future.

@jcoupey

jcoupey commented Mar 20, 2021

there is interest in an alternative file loading mechanism that reduces memory footprint without a performance penalty

That's a very interesting prospect indeed, which goes beyond the case of this PR. @mjjbell as you pointed out there are already a few tickets about various specific aspects of this question, maybe we should open a meta-ticket on the core principle of a new loading mechanism?

@eveyeti

eveyeti commented Mar 20, 2021

Hi @eveyeti,

c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.

I don't know exactly, but usually such a problem means there is not enough memory to compile. Try increasing the memory and/or reducing the number of processor cores available to Docker in your Dockerfile.

Compare your Dockerfile against the OSRM master branch.

Hi @nnseva, thanks a lot, your advice worked!

Regards
Etienne.-

@nnseva
Author

nnseva commented Mar 24, 2021

@jcoupey

there is interest in an alternative file loading mechanism that reduces memory footprint without a performance penalty

That's a very interesting prospect indeed, which goes beyond the case of this PR. @mjjbell as you pointed out there are already a few tickets about various specific aspects of this question, maybe we should open a meta-ticket on the core principle of a new loading mechanism?

I'd like to point out that there are open-ended future requirements for annotating nodes and segments along the found route.

Looking ahead, it might be appropriate to have a customizable annotations feature, which could influence memory consumption. This feature should allow customizing not only the values of annotations (as is available now) but also their keys.

Such a customizable annotation feature should allow configuring the set of available annotation keys, as well as the procedures used to get values for those annotations. Such a procedure could be a scripting hook called during the extraction phase, or just a special open data file format associating source OSM IDs (for nodes) and OSM ID pairs (for segments) with annotated values.

With such a feature, operations staff would have options to control memory consumption by enabling only the annotation keys necessary for the deployed OSRM instance.

@nnseva
Author

nnseva commented Mar 24, 2021

I agree that this should be opt-in rather than opt-out (that was the initial proposal in #5325).
I expect most users will use the default processing setup, so we should optimise for that and give advanced users the option to customise.

Please note that the annotation key set is defined at compile time:

    enum class AnnotationsType
    {
        None = 0,
        Duration = 0x01,
        Nodes = 0x02,
        Distance = 0x04,
        Weight = 0x08,
        Datasources = 0x10,
        Speed = 0x20,
        Ways = 0x40,
        All = Duration | Nodes | Distance | Weight | Datasources | Speed | Ways
    };

An unsophisticated user relying on the default options will probably expect all available annotation keys, including ways, to be shown when all annotations are requested (the annotations=true request parameter). Why would the "ways" section (and only that one!) not appear? Optimization is not a good explanation for such a missing section, from my point of view.

@jcoupey

jcoupey commented Mar 24, 2021

Why would the "ways" section (and only that one!) not appear?

That would indeed be somewhat inconsistent. The remarks about opt-in rather reflect a status-quo position resulting from a concern about RAM requirements for standard usage. Of course you're right that the same could (should?) apply to other datasets, but we have to deal with things as they are now, and none of the rest is optional.

I think the core question here is whether we have the ability to add a generic mechanism to avoid loading a range of datasets. If this can be done realistically mid-term, we can probably merge this and live with the temporary increase until we provide opt-out mechanisms.

Probably others with more knowledge of the codebase and implications can comment on that last part?

@mjjbell
Member

mjjbell commented Mar 24, 2021

An unsophisticated user relying on the default options will probably expect all available annotation keys, including ways, to be shown when all annotations are requested (the annotations=true request parameter). Why would the "ways" section (and only that one!) not appear? Optimization is not a good explanation for such a missing section, from my point of view.

@nnseva I see your point. I was thinking it would have to return an error indicating the annotation was disabled. But if this means we're building a quasi-optional data loader, then maybe it's better to design it properly to work for all optional datasets.

I think the core question here is whether we have the ability to add a generic mechanism to avoid loading a range of datasets. If this can be done realistically mid-term, we can probably merge this and live with the temporary increase until we provide opt-out mechanisms.

Probably others with more knowledge of the codebase and implications can comment on that last part?

There is an existing ticket (#5218) that outlines the steps required. It was a while ago, but @danpat, maybe you have a sense of the effort required? I don't think the data loading code has changed much since then.

    @@ -494,7 +494,7 @@ class RouteAPI : public BaseAPI
                return anno.datasource;
            });
        }
    -   std::vector<uint32_t> nodes;
    +   std::vector<uint64_t> nodes;
Contributor


Should this really be uint64 here? In Flatbuffers they are defined as uint32.

Edit:
I have to correct myself, it was redefined to uint64. Why?


7 participants