Address autocompletion #12

Open
vkrause opened this issue Feb 15, 2024 · 13 comments
Labels
enhancement New feature or request

Comments

@vkrause
Member

vkrause commented Feb 15, 2024

By default this is handled by the address module of MOTIS. That however gets OOM-killed when importing the European OSM dataset.

In particular this happens here: https://github.com/motis-project/address-typeahead/blob/b6b5e60faac2921f1c0b3813da9c11731a3ca31d/src/extractor.cc#L404

NodeLocationsForWays's use of osmium::index::map::FlexMem shows up in Heaptrack as the main cost when testing with the CH subset, although FlexMem still runs in "sparse" mode there because the dataset is smaller. When explicitly forcing "dense" mode (which is expected to be used for the EU dataset), even the CH subset runs out of memory (assuming I did that correctly).
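One possible reading of the sparse/dense observation is that a sparse index pays per node actually present, while a dense index pays per possible node ID. The sketch below is only a back-of-envelope estimate: the byte sizes and the node counts are assumptions for illustration, not measurements of libosmium or of any real extract, and a real dense index may allocate its storage lazily in blocks and therefore use less.

```python
# Back-of-envelope memory estimate for a node-location index.
# All sizes and counts below are assumptions for illustration,
# not measurements of libosmium or of any real OSM extract.

BYTES_PER_LOCATION = 8  # assumed: two 32-bit fixed-point coordinates
BYTES_PER_NODE_ID = 8   # assumed: 64-bit node ID


def sparse_bytes(node_count: int) -> int:
    """Sparse layout: one (id, location) pair per node actually present."""
    return node_count * (BYTES_PER_NODE_ID + BYTES_PER_LOCATION)


def dense_bytes(max_node_id: int) -> int:
    """Dense layout: one location slot per possible ID up to the maximum."""
    return (max_node_id + 1) * BYTES_PER_LOCATION


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical extract: 50 million nodes, IDs ranging up to 11 billion.
    nodes, max_id = 50_000_000, 11_000_000_000
    print(f"sparse: {to_gib(sparse_bytes(nodes)):.1f} GiB")
    print(f"dense:  {to_gib(dense_bytes(max_id)):.1f} GiB")
```

Under these assumptions the dense estimate depends only on the highest node ID, which would be consistent with a small regional extract running out of memory once dense mode is forced.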

Possible scenarios/approaches:

(1) Disable address autocompletion.

Limited loss of features in the web ui, but not ideal.

(2) Bug in FlexMem use of https://github.com/motis-project/address-typeahead

It's quite possible nobody has tried to load such a large OSM subset into this, so even issues that would be realistic to fix or optimize may have remained in that code path. This needs further investigation, but if that's the case, fixing them would let us enable this feature without extra work or setup.

(3) External address autocompleter

Possible alternatives exist, such as Nominatim (OSM's default, supports incremental updates and proven to scale to the full planet dataset), or Photon (used by some Digitransit/OTP installations AFAIK). Integration into MOTIS would need work but doesn't appear too difficult.

@hbruch

hbruch commented Feb 16, 2024

Regarding option 3): photon requires an existing Nominatim database to import from. In contrast to Nominatim, photon provides autocomplete functionality and some fuzzy matching. For stop searches, we usually augment the photon dataset by official stop registry data, as OSM stop information is usually incomplete.

@vkrause
Member Author

vkrause commented Feb 16, 2024

> Regarding option 3): photon requires an existing Nominatim database to import from. In contrast to Nominatim, photon provides autocomplete functionality and some fuzzy matching.

Oh, so it's both, not either/or. Can Photon update incrementally from an incrementally updated Nominatim?

> For stop searches, we usually augment the photon dataset by official stop registry data, as OSM stop information is usually incomplete.

Stop searches are separate in MOTIS; my assumption would be that those are based solely on the GTFS data. Basic testing on the demo instance shows that this also seems to work with different local languages (e.g. in Brussels). What doesn't seem to work though (and maybe that's asking a bit much here) is foreign languages (i.e. neither "Brüssel" nor "Cologne" work).

@SprickW

SprickW commented Feb 17, 2024

> Stop searches are separate in MOTIS, ...

Yes, by design. Any search engine for locations can be combined with MOTIS, as the MOTIS GUI and API support coordinates (WGS84). In general: any application can use MOTIS routing as long as it can express "from A to B" with coordinates.

@PartTimeDataScientist
Contributor

> (3) External address autocompleter
>
> Possible alternatives exist, such as Nominatim (OSM's default, supports incremental updates and proven to scale to the full planet dataset), or Photon (used by some Digitransit/OTP installations AFAIK). Integration into MOTIS would need work but doesn't appear too difficult.

Pelias might also be a viable option. Although it is apparently no longer actively developed, it seems to be stable enough for the commercial instance and seems to be actively used by Entur in Norway:

https://github.com/entur/pelias-api
https://github.com/entur/kakka

Pelias has custom importers which could be used to import a list of stations, e.g. using the CSV importer.
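As a rough illustration of the station-list idea, the sketch below converts a GTFS `stops.txt` into a flat CSV that a geocoder importer could consume. The input columns (`stop_id`, `stop_name`, `stop_lat`, `stop_lon`) are standard GTFS; the output column names and the `source`/`layer` values are assumptions and would need to be checked against the importer's own documentation.

```python
import csv
import io

# Output columns are an assumption about what a CSV importer might expect.
OUTPUT_FIELDS = ["id", "name", "lat", "lon", "source", "layer"]


def gtfs_stops_to_geocoder_csv(stops_txt: str, source: str = "gtfs") -> str:
    """Convert the text of a GTFS stops.txt into a flat CSV of named points."""
    reader = csv.DictReader(io.StringIO(stops_txt))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=OUTPUT_FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Skip entries without coordinates; they cannot be geocoded.
        if not row.get("stop_lat") or not row.get("stop_lon"):
            continue
        writer.writerow({
            "id": row["stop_id"],
            "name": row["stop_name"],
            "lat": row["stop_lat"],
            "lon": row["stop_lon"],
            "source": source,   # hypothetical source label
            "layer": "stop",    # hypothetical layer name
        })
    return out.getvalue()


if __name__ == "__main__":
    sample = (
        "stop_id,stop_name,stop_lat,stop_lon\n"
        "1,Central Station,50.1,8.6\n"
        "2,No Coords,,\n"
    )
    print(gtfs_stops_to_geocoder_csv(sample))
```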

@vkrause
Member Author

vkrause commented Feb 24, 2024

The recommendation from people at the OSM Hack Weekend in Karlsruhe for this is Photon. It can do incremental updates, and there's a pre-built database for download (although it supports only a limited number of languages; the supported languages are an import-time setting). A full planet import is said to be doable with 64 GB of RAM.

@PartTimeDataScientist
Contributor

As far as I see it, another major drawback of the prebuilt indices is that they only contain stop information already included in OSM. I think it will be advantageous to be able to import the GTFS stop information into the geocoder at some point.

This then seems to require a local Nominatim instance, but I guess there's nothing wrong with starting with the out-of-the-box Photon indices and switching to a customized Nominatim database extract later on...

@vkrause
Member Author

vkrause commented Feb 24, 2024

Right, I'd also expect this will need a fully self-hosted setup eventually (for incremental updates and more languages), but the prebuilt index could be useful for evaluation/testing and work on integrating this with MOTIS.

@hbruch

hbruch commented Feb 24, 2024

With the support from @lonvia, we augmented stadtnavi's photon with official stops in Baden-Württemberg, see https://github.com/stadtnavi/digitransit-ansible/blob/master/roles/photon/tasks/main.yml.

@vkrause
Member Author

vkrause commented Feb 25, 2024

Yep, she also mentioned various approaches and ideas yesterday on how we could fine-tune this for our use case here. Might not be the most pressing issue right now, but I certainly like us having that option.

@vkrause
Member Author

vkrause commented Mar 5, 2024

I've implemented a MOTIS <-> Photon bridge in https://github.com/vkrause/motis/tree/work/vkrause/photon. We don't gain much by this though, as this isn't used anywhere internally in MOTIS itself but only by its (deprecated) web UI.

For an actual deployment I'd rather suggest we bypass MOTIS for this entirely and expose the (more powerful) Photon API directly. However, this does demonstrate how we can integrate external services into MOTIS as a drop-in replacement for internal modules, i.e. doing this for the OSM routing engines is probably much more interesting.
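For context, querying Photon directly could look roughly like the sketch below. It assumes a self-hosted Photon instance reachable at a hypothetical base URL and uses Photon's `/api` search endpoint with the `q`, `limit` and `lang` parameters; the helper names are made up for illustration and are not part of MOTIS or Photon.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a self-hosted Photon instance; adjust to the real deployment.
PHOTON_BASE_URL = "http://localhost:2322"


def build_photon_url(query, limit=5, lang=None, base_url=PHOTON_BASE_URL):
    """Build a search URL for Photon's /api endpoint."""
    params = {"q": query, "limit": limit}
    if lang is not None:
        params["lang"] = lang
    return f"{base_url}/api?{urlencode(params)}"


def parse_suggestions(geojson):
    """Extract (name, lat, lon) tuples from a GeoJSON FeatureCollection."""
    results = []
    for feature in geojson.get("features", []):
        # GeoJSON stores coordinates as [longitude, latitude].
        lon, lat = feature["geometry"]["coordinates"][:2]
        name = feature.get("properties", {}).get("name", "")
        results.append((name, lat, lon))
    return results


def autocomplete(query, **kwargs):
    """Fetch suggestions from the configured Photon instance."""
    with urlopen(build_photon_url(query, **kwargs), timeout=10) as response:
        return parse_suggestions(json.load(response))
```

A frontend would call something like `autocomplete("Brüssel", lang="de")` on each keystroke and pass the chosen coordinates on to routing.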

@jbruechert
Collaborator

I think abstracting over the geocoder is actually pretty useful, since it allows us to switch the geocoder without breaking the API.

@vkrause
Member Author

vkrause commented Mar 6, 2024

Right, but even then the Photon API looks like a better starting point than the current MOTIS one I'd say.

@PartTimeDataScientist
Contributor

> I think abstracting over the geocoder is actually pretty useful, since it allows us to switch the geocoder without breaking the API.

That's the very same thought that I had for the Mobidrom Routing Services. Thus I've started to build a simple REST proxy based on GeoPy. Repo is here (currently only supporting the public Photon instance from Komoot). Currently there's no live instance, only a .devcontainer config and a Dockerfile, so it should be straightforward to play around with it. I plan to set up a live instance for testing in the next few days (depending on my other schedules...)

My goal is to have a REST interface which provides the main API endpoints that Photon (API-Documentation at Komoot) and Pelias (API-Documentation Forward, API-Documentation Reverse) have to offer.

In this abstraction layer the response needs to be different, as not all geocoders which might be relevant upstream (Photon, Pelias, Nominatim, ...) deliver the same amount of information. Similar to Photon, though, I think the response should be valid GeoJSON, and it could include the raw response of the upstream geocoder, as I've done for the PoC.
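A minimal sketch of such a normalized response might look like this. It is not the code of the PoC mentioned above; the function names and the `upstream`/`raw` property names are hypothetical. It relies only on two well-known response shapes: Photon returns GeoJSON features, and Nominatim's JSON format returns items with `display_name`, `lat` and `lon` (the latter two as strings).

```python
from typing import Any, Dict, Iterable


def to_feature(name: str, lat: float, lon: float,
               upstream: str, raw: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap one geocoder hit as a GeoJSON Feature, keeping the raw upstream data."""
    return {
        "type": "Feature",
        # GeoJSON coordinate order is [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name, "upstream": upstream, "raw": raw},
    }


def from_photon(feature: Dict[str, Any]) -> Dict[str, Any]:
    """Normalize a Photon GeoJSON feature."""
    lon, lat = feature["geometry"]["coordinates"][:2]
    name = feature.get("properties", {}).get("name", "")
    return to_feature(name, lat, lon, "photon", feature)


def from_nominatim(item: Dict[str, Any]) -> Dict[str, Any]:
    """Normalize a Nominatim JSON result (lat/lon arrive as strings)."""
    return to_feature(item.get("display_name", ""),
                      float(item["lat"]), float(item["lon"]),
                      "nominatim", item)


def to_collection(features: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Bundle normalized features into a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}
```

Keeping the upstream payload under a single property means clients that only need a name and coordinates can ignore it, while clients that know the upstream geocoder can still reach its extra fields.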

@derhuerst derhuerst added the enhancement New feature or request label Mar 19, 2024

6 participants