Adding new tags to stylesheets #88

Closed
pnorman opened this issue Sep 28, 2013 · 7 comments

Comments

@pnorman
Collaborator

pnorman commented Sep 28, 2013

In some ways this is an issue with the stylesheets, but osm2pgsql is the common component and it'd be nice to solve this once.

Now that the stylesheets are being actively developed again, we're going to reach a point where we need to render something based on a new tag. There have been three openstreetmap-carto requests for new features with tags not in default.style (gravitystorm/openstreetmap-carto#111 gravitystorm/openstreetmap-carto#180 gravitystorm/openstreetmap-carto#81) and although none of those are certain to get in, it's a matter of time before we want to add a feature which relies on tags not in default.style. HDM-CartoCSS is already using default.style + hstore.

I believe it is desirable to use a common .style file among multiple stylesheets if at all possible.

I see four options for adding new tags. All of them would involve a reload of orm/yevaud because the required information simply isn't in their databases.

  1. Add new columns to default.style, requiring a database reload to get all the newly desired data in each time a tag is added. This has been the traditional way to add tags and is probably still possible for tile.osm.org with both orm and yevaud running although it would take care to avoid overloading them. For most servers it would involve downtime.

    It would also mean you need to be very careful about osm2pgsql/.style versions to make sure that you've got all the tag columns.

    With the database reloads on every tag addition and the chance of version hell, this is my least preferred option.

  2. Use the existing default.style with --hstore. This avoids the database reloads, but leaves us with an incoherent .style file. For example, default.style has generator:source which by any logic should be in hstore.

    This would work, but we'd be missing an opportunity to rationalize some of the legacy cruft in default.style.

  3. Use empty.style and a tags hstore column exclusively for OSM tags. This would be combined with suitable indexes, either on the entire tags column (GIN (tags)), partial indexes like GIST (way) WHERE tags ? 'amenity' OR tags ? 'shop', or a composite index like GIST (way, tags).

    This would require a reload and a rewrite of existing queries for efficiency. It would be possible to create a view that allows an existing stylesheet to be used without changes, but this would end up running queries like WHERE (tags->'building') IS NOT NULL instead of WHERE tags ? 'building'.

  4. Create a new style file with significant keys as columns and the remainder in hstore. This would put tags like natural, waterway, amenity, name, etc. as columns and have the rest in hstore. As with right now, partial indexes would be necessary for performance. I expect all medium to medium-high queries would use one of the tag columns, but there are certain to be some that would not. Because of the differences between stylesheets, having a column for every tag used in a WHERE condition is a moving target.
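The partial indexes mentioned in options 3 and 4 could be generated along these lines. This is a minimal sketch, not part of osm2pgsql: the helper `partial_index_sql` and the index naming scheme are hypothetical, and it assumes the default `planet_osm_*` table names, a geometry column called `way`, and an hstore column called `tags`.

```python
import re


def partial_index_sql(table, keys, geometry_column="way", hstore_column="tags"):
    """Return a CREATE INDEX statement for a partial GiST index on the
    geometry column, limited to rows whose hstore contains any of `keys`.

    Hypothetical helper: the index naming scheme is an assumption.
    """
    if not keys:
        raise ValueError("at least one key is required")
    # Double any single quotes so a key cannot break out of the SQL literal.
    literals = ["'" + key.replace("'", "''") + "'" for key in keys]
    predicate = " OR ".join(f"{hstore_column} ? {lit}" for lit in literals)
    # Keys such as area:highway contain characters that are not valid in an
    # unquoted identifier, so replace them when building the index name.
    suffix = "_".join(re.sub(r"\W", "_", key) for key in keys)
    return (
        f"CREATE INDEX {table}_{suffix}_idx ON {table} "
        f"USING GIST ({geometry_column}) WHERE {predicate};"
    )


print(partial_index_sql("planet_osm_point", ["amenity", "shop"]))
```

Which keys deserve an index depends on the stylesheet's queries, so the key list would probably need to be maintained alongside the stylesheet rather than in osm2pgsql.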

I considered options involving --hstore-match-only but this would fail in the case of area:highway (gravitystorm/openstreetmap-carto#180) and abandoned:building (hotosm/HDM-CartoCSS#126).

At this point I think what is necessary is benchmarking of options 3 and 4 to see what kind of performance considerations there are.

cc @gravitystorm, @yohanboniface, @jburgess777

@apmon
Contributor

apmon commented Sep 28, 2013

I would probably advocate a mixture of 2) and 4).

Reimport the database on orm/yevaud with the current style sheet and with --hstore-all. Then, bit by bit, move the carto style over to use the hstore column instead of normal columns where applicable. Once a normal column is no longer needed, one can drop it from a new style (osm_org.style) derived from the current default.style. With an ALTER TABLE, one should be able to clean up orm/yevaud eventually as well.
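The eventual clean-up step could look roughly like the following. It is only a sketch: `drop_column_sql` is a hypothetical helper, the table list assumes the default `planet_osm` prefix, and it assumes the hstore column already holds every tag, so dropping a regular column loses no data.

```python
# Default osm2pgsql rendering tables, assuming the planet_osm prefix.
TABLES = ("planet_osm_point", "planet_osm_line", "planet_osm_polygon", "planet_osm_roads")


def drop_column_sql(table, key):
    """Return an ALTER TABLE statement that drops the column for `key`.

    The column name is double-quoted because keys such as generator:source
    are not valid unquoted identifiers. IF EXISTS keeps the statement
    harmless on tables that never had the column.
    """
    quoted = '"' + key.replace('"', '""') + '"'
    return f"ALTER TABLE {table} DROP COLUMN IF EXISTS {quoted};"


for table in TABLES:
    print(drop_column_sql(table, "generator:source"))
```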

@pnorman
Collaborator Author

pnorman commented Sep 30, 2013

I'd advocate mixing them, but the other way around.

Reimport with new.style to a different prefix, then use views to present tables equivalent to those of the current style.

@AndrewBuck

I would advocate for option 4 as well. OSM basically has only 2 or 3 really common tags; I think building=* and highway=* are the really common ones, with maybe landuse or source being the next most common.

After these few tags, the number will drop off quite quickly. So I would suggest having an actual column in the style file for the 5 or so most common tags (discounting discardable ones like tiger tags, etc) and then create an hstore column with everything else in it. I would then create indexes on the next 10 or so most common tags for the hstore column. I think that should give reasonable performance for a typical stylesheet, but for ones which use a lot of features not covered by these indices, they can then add their own indices on hstore as necessary.
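As a rough illustration of that split, the sketch below separates one object's tags into dedicated columns and an hstore remainder. The key list and the discard prefix are assumptions chosen for the example, not measured tag frequencies, and `split_tags` is a hypothetical helper rather than anything osm2pgsql provides.

```python
# Hypothetical choices for illustration only, not measured tag usage.
COLUMN_KEYS = ("building", "highway", "landuse")
DISCARD_PREFIXES = ("tiger:",)


def split_tags(tags, column_keys=COLUMN_KEYS, discard_prefixes=DISCARD_PREFIXES):
    """Split an object's tags into (columns, hstore).

    Tags whose key starts with a discard prefix are dropped, keys listed in
    `column_keys` become real columns (None when absent), and everything
    else is kept for the hstore column.
    """
    kept = {k: v for k, v in tags.items() if not k.startswith(discard_prefixes)}
    columns = {k: kept.get(k) for k in column_keys}
    hstore = {k: v for k, v in kept.items() if k not in column_keys}
    return columns, hstore


columns, hstore = split_tags(
    {"building": "yes", "name": "Town Hall", "tiger:cfcc": "A41"}
)
print(columns)
print(hstore)
```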

-AndrewBuck

@AndrewBuck

One other thing to consider is the issue of special tables for things like roads, buildings, etc. Right now we have a table just for roads and then a table for all other line features. Since we are re-designing the layout of the DB anyway, we could switch to a design using 4 or 5 "special" tables for very common features that tend to have only one "class" of tags on them, and then a final "catch all" table which has hstore columns to capture all the other objects.

The special tables would contain only objects that have tags from some specific list and no others. Any object with even a single tag that doesn't fit into one of these columns would be put in the "catch all" table with the hstore column. This would not only allow very quick rendering of the most common object types (since they come from the special tables with dedicated columns), but would also speed up queries on the "catch all" table: once buildings, roads, and landuse are removed, that table ends up significantly smaller, so queries against it will be a lot faster whether or not indexes are employed.

The disadvantage of this system is that a query to show a common object like a building would end up needing two queries, one to the special table to get the bulk of them, and then one to the "catch all" table to get the remaining handful that just happen to have an extra odd tag or two on them that forces them into the hstore table. This second query could be eliminated however by including all the objects that have any of the special tags on them in all of the special tables and then having an hstore column on each special table with the remaining tags for the straggler objects. This means only one table should ever need to be queried to get a particular feature type, but it will result in some objects appearing in multiple tables.
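The two routing schemes described above can be compared with a small sketch. The table names and their key lists are made up for the example, and the second function reads "all of the special tables" as "every special table whose keys the object carries", which is an interpretation rather than something stated in the comment.

```python
# Hypothetical special tables and the keys each one has columns for.
SPECIAL_TABLES = {
    "buildings": {"building"},
    "highways": {"highway"},
    "landuse": {"landuse"},
}


def route_strict(tags):
    """First scheme: an object goes to a special table only when every one
    of its keys fits that table; anything else goes to the catch-all."""
    keys = set(tags)
    for table, allowed in SPECIAL_TABLES.items():
        if keys and keys <= allowed:
            return [table]
    return ["catch_all"]


def route_duplicating(tags):
    """Second scheme: an object is stored in every special table whose keys
    it carries (extra tags would live in that table's hstore column), so a
    feature type can be read from a single table."""
    keys = set(tags)
    tables = [t for t, allowed in SPECIAL_TABLES.items() if keys & allowed]
    return tables or ["catch_all"]


print(route_strict({"building": "yes", "name": "Town Hall"}))
print(route_duplicating({"building": "yes", "name": "Town Hall"}))
```

Under the first scheme a named building lands in the catch-all table, which is why a building layer would need two queries; under the second it stays in the buildings table at the cost of some objects being stored more than once.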

-AndrewBuck

@pnorman
Collaborator Author

pnorman commented Sep 30, 2013

"Right now we have a table just for roads and then a table for all other line features."

No, the roads table is for low-zoom rendering, not for roads.

@woodpeck
Contributor

"Since we are re-designing the layout of the DB anyway"? Hardly.

Personally I've been using a style file that has proper database columns for exactly those tags used in WHERE clauses, and everything else in hstore, plus matching views so that I can use the standard style sheets with zero changes. The performance hit is negligible.
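A compatibility view of this kind might be generated as in the sketch below. It is not the actual setup described in the comment: `compat_view_sql` is a hypothetical helper, the `new_point` table assumes an import with a different prefix, and only `osm_id` and `way` are carried over besides the tag columns.

```python
def compat_view_sql(view, table, real_columns, hstore_keys):
    """Return a CREATE VIEW statement that exposes hstore keys as columns,
    so a stylesheet written against plain columns can run unchanged."""

    def ident(name):
        # Double-quote identifiers; keys like addr:housenumber need it.
        return '"' + name.replace('"', '""') + '"'

    def literal(value):
        # Single-quote the hstore key and double any embedded quotes.
        return "'" + value.replace("'", "''") + "'"

    select = ["osm_id", "way"]
    select += [ident(c) for c in real_columns]
    select += [f"tags->{literal(k)} AS {ident(k)}" for k in hstore_keys]
    return f"CREATE VIEW {view} AS SELECT {', '.join(select)} FROM {table};"


print(
    compat_view_sql(
        "planet_osm_point", "new_point", ["amenity"], ["name", "addr:housenumber"]
    )
)
```

Keys used in WHERE clauses stay real columns in `real_columns`, so the planner can still use ordinary partial indexes on them; the hstore-backed columns are only read for output.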

@lonvia
Collaborator

lonvia commented Nov 29, 2019

  1. is essentially implemented in openstreetmap-carto. Nothing is required from osm2pgsql anymore.

@lonvia lonvia closed this as completed Nov 29, 2019