Support incremental updates #49

Open
clarisma opened this issue Nov 11, 2022 · 1 comment
Labels
roadmap Features scheduled for upcoming releases

Comments

@clarisma
Owner

Currently, each GOL contains a snapshot of OSM data, imported from a specific .osm.pbf file.

To get an updated version, users need to download a new .osm.pbf file (or apply .osc files to an existing one) and build a new GOL from that file. Fortunately, the build is relatively fast, but this is still less than ideal if frequent updates are required. Ideally, users could update the GOL with just the OSM data that has changed.

Here is the process we have in mind:

GOL files are already designed to be patchable. In theory, we could accommodate minutely updates, but our initial design is for a process that runs daily, or possibly hourly (the update-processing software would need to be considerably more complex to finish its work within 60 seconds).

Instead of directly applying .osc change files to the GOL, a separate program first pre-processes each .osc file into a set of patch files (one for each tile in which features have changed) and a version manifest (a small file that indicates which tiles have changed).
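
As a rough illustration of this pre-processing step, the sketch below groups changed features by tile and derives a manifest from the result. Everything in it is an assumption made for the example: the `Change` record, the `tile_of` function (which uses ordinary slippy-map tile numbers at a fixed zoom level rather than the GOL's own tile layout), and the manifest structure. It is not the actual patch or manifest format.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple

TileId = Tuple[int, int]

class Change(NamedTuple):
    """One changed feature taken from an .osc file (hypothetical record)."""
    feature_id: int
    lon: float
    lat: float
    action: str  # "create", "modify" or "delete"

def tile_of(lon: float, lat: float, zoom: int = 12) -> TileId:
    """Assumed tiling: standard slippy-map tile numbers at a fixed zoom."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

def build_patches(changes: Iterable[Change], new_version: int):
    """Group changes into one patch per tile and list the changed tiles."""
    patches: Dict[TileId, List[Change]] = defaultdict(list)
    for change in changes:
        patches[tile_of(change.lon, change.lat)].append(change)
    manifest = {"version": new_version, "changed_tiles": sorted(patches)}
    return dict(patches), manifest
```

The point of the split is that the manifest stays small (only tile identifiers), so a client can learn what changed without downloading any patch it does not need.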

Ideally, this would run on a central server, so downstream users can skip the build import (which is comparatively resource-intensive) and just download tiles and patches.

For data consumers, the workflow would then look like this:

Instead of

build planet <osm-pbf-file>

you would use

load planet <tileset-url> [ -a=<area> | -b=<bbox> ]

to download the current version of the planet (or only the regions required). The compressed tiles are similar in size to data in osm-pbf format. The tiles are simply unzipped and written into the GOL, so this step happens essentially at download speed.

Using

update planet <tileset-url>

you can then periodically update your GOL to the most current state of the data. This should only take seconds if the GOL is already reasonably current (again, basically whatever time is required to download the patches).

Updating happens as a two-step process: First, the client-side engine downloads the version manifest (which indicates the tiles that changed) and marks those tiles in the user's GOL file as stale. It then downloads patch files that contain the changes and applies them to the tiles to bring them to the current version.
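
The two steps could be sketched along these lines. The data shapes are assumptions for illustration only: tiles are plain dictionaries, the manifest maps a tile identifier to its latest version number, and `fetch_patch` is a hypothetical callback standing in for the patch download.

```python
from typing import Callable, Dict, Optional, Set

Tile = Dict[int, str]             # feature id -> feature data (placeholder)
Patch = Dict[int, Optional[str]]  # feature id -> new data, or None to delete

def find_stale_tiles(local_versions: Dict[str, int],
                     manifest: Dict[str, int]) -> Set[str]:
    """Step 1: tiles present locally whose version lags the manifest."""
    return {tile_id for tile_id, version in manifest.items()
            if tile_id in local_versions and local_versions[tile_id] < version}

def apply_patch(tile: Tile, patch: Patch) -> None:
    """Apply creations, modifications and deletions to one tile in place."""
    for feature_id, data in patch.items():
        if data is None:
            tile.pop(feature_id, None)
        else:
            tile[feature_id] = data

def update(tiles: Dict[str, Tile],
           local_versions: Dict[str, int],
           manifest: Dict[str, int],
           fetch_patch: Callable[[str], Patch]) -> Set[str]:
    """Step 2: download and apply a patch for every stale tile."""
    stale = find_stale_tiles(local_versions, manifest)
    for tile_id in sorted(stale):
        apply_patch(tiles[tile_id], fetch_patch(tile_id))
        local_versions[tile_id] = manifest[tile_id]
    return stale
```

Tiles that appear in the manifest but are not part of the local GOL (for example, outside the area that was loaded) are simply ignored in this sketch.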

Updating can be eager (all stale tiles are immediately brought current by the update command) or lazy (patches are downloaded and applied to stale tiles only once they are actually required by a query).
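
The difference between the two modes might look like this in code. The `TileStore` class and its `refresh` callback (which stands in for "download the patch and apply it") are hypothetical names used only for this sketch.

```python
from typing import Callable, Dict, Iterable, Set

class TileStore:
    """Toy tile store that tracks which tiles are stale."""

    def __init__(self, tiles: Dict[str, str],
                 refresh: Callable[[str, str], str]) -> None:
        self.tiles = dict(tiles)
        self.refresh = refresh      # (tile id, old data) -> current data
        self.stale: Set[str] = set()

    def update(self, changed: Iterable[str], eager: bool) -> None:
        """Mark changed tiles as stale; in eager mode, patch them right away."""
        self.stale |= {t for t in changed if t in self.tiles}
        if eager:
            for tile_id in sorted(self.stale):
                self._bring_current(tile_id)

    def _bring_current(self, tile_id: str) -> None:
        self.tiles[tile_id] = self.refresh(tile_id, self.tiles[tile_id])
        self.stale.discard(tile_id)

    def query(self, tile_id: str) -> str:
        """Lazy mode: a stale tile is patched only when a query touches it."""
        if tile_id in self.stale:
            self._bring_current(tile_id)
        return self.tiles[tile_id]
```

Eager updating pays the full download cost up front, while lazy updating spreads it across queries and never fetches patches for tiles that are not read.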

@clarisma clarisma added the roadmap Features scheduled for upcoming releases label Nov 11, 2022
@clarisma
Owner Author

clarisma commented Dec 1, 2022

The update command will support incremental updates directly from .osc files. The process will likely be sufficiently fast to process minute-by-minute updates, even on low-end systems. GOLs that are updatable in this manner will need to be built with additional indexes, increasing required storage by about 40%.

This form of updating will be able to generate patch files, which can be downloaded by data consumers that use "regular" GOLs (no additional indexes needed; updates in seconds).
