Skip to content

v0.2.47..v0.2.48 changeset HootenannyImplicitTagDatabaseGeneration.asciidoc

Garret Voltz edited this page Sep 27, 2019 · 1 revision
diff --git a/docs/developer/HootenannyImplicitTagDatabaseGeneration.asciidoc b/docs/developer/HootenannyImplicitTagDatabaseGeneration.asciidoc
index 137e254..10046e2 100644
--- a/docs/developer/HootenannyImplicitTagDatabaseGeneration.asciidoc
+++ b/docs/developer/HootenannyImplicitTagDatabaseGeneration.asciidoc
@@ -2,29 +2,29 @@
 [[ImplicitTagRulesDatabaseGeneration]]
 == Implicit Tag Rules Database Generation
 
-Hootenanny has the capability to derive type tag information automatically for elements based soley on the contents of their name tags.  
-To do this tagging Hootenanny first parses global data, currently OSM and GeoNames, and generates a tag rules database.  The configured 
-database is set with the implicit.tagger.rules.database configuration option.  Periodically, the rules database may be re-generated 
-with the goal of improving the tagging performance.  This document outlines the steps to regenerate the database.  See the associated 
-implicit tagging command line documentation for additional details.  Furthermore, familiarize yourself with the implicit tagging 
+Hootenanny has the capability to derive type tag information automatically for elements based soley on the contents of their name tags.
+To do this tagging Hootenanny first parses global data, currently OSM and GeoNames, and generates a tag rules database.  The configured
+database is set with the implicit.tagger.rules.database configuration option.  Periodically, the rules database may be re-generated
+with the goal of improving the tagging performance.  This document outlines the steps to regenerate the database.  See the associated
+implicit tagging command line documentation for additional details.  Furthermore, familiarize yourself with the implicit tagging
 configuration options by reading their descriptions (configuration options beginning with implicit.tagging.* and implicit.tagger.*).
 
 === Acquire Input Data
 
-Download the latest OSM Planet (.osm.pbf) and GeoNames allCountries (.geonames) files.  Additional datasets can be also be used as inputs, 
+Download the latest OSM Planet (.osm.pbf) and GeoNames allCountries (.geonames) files.  Additional datasets can be also be used as inputs,
 if desired.
 
 === Input Pre-filtering (optional)
 
-If you will be repeatedly generating implicit tag raw rules files as part of raw rules generation code changes or rule experimentation, 
-you should pre-filter your input datasets to data that is eligible to have implicit tags derived from it to save future processing time.  
+If you will be repeatedly generating implicit tag raw rules files as part of raw rules generation code changes or rule experimentation,
+you should pre-filter your input datasets to data that is eligible to have implicit tags derived from it to save future processing time.
 The pre-filtered input datasets must be translated if in a format other than OSM.  A pre-filtering example:
 
 ---------------------------
 hoot convert -D convert.ops="hoot::TranslationVisitor;hoot::ImplicitTagEligiblePoiPolyCriterion" -D schema.translation.script="translations/GeoNames.js" allCountries.geonames allCountries-filtered.osm.pbf
 
 hoot convert -D convert.ops="hoot::ImplicitTagEligiblePoiPolyCriterion" planet.osm.pbf planet-filtered.osm.pbf
-----------------------------
+---------------------------
 
 === Raw Rules Derivation
 
@@ -33,15 +33,15 @@ Next, you need to generate a raw rules flat file from the input data.  Here is a
 ---------------------------
 hoot type-tagger-rules --create-raw "planet.osm.pbf;allCountries.geonames" "none;translations/GeoNames.js" \
 osm-geonames.implicitTagRules
-----------------------------
+---------------------------
 
-If you already performed the optional pre-filtering and translation from the previous section, you can use this example to save processing 
+If you already performed the optional pre-filtering and translation from the previous section, you can use this example to save processing
 time which bypasses filtering and translation:
 
 ---------------------------
 hoot type-tagger-rules --create-raw -D implicit.tagging.raw.rules.deriver.skip.filtering=true \
 "planet-filtered.osm.pbf;allCountries-filtered.geonames" "none;none" osm-geonames-poi.implicitTagRules
-----------------------------
+---------------------------
 
 === Rules Database Derivation
 
@@ -49,17 +49,17 @@ Finally, you need to generate the implicit tag rules database.  An example:
 
 ---------------------------
 hoot type-tagger-rules --create-db osm-geonames-poi.implicitTagRules osm-geonames-poi-rules.sqlite
----------------------------------
+---------------------------
 
 You can check the results of a generated database with:
 
 -------------------------
 hoot type-tagger-rules --db-stats osm-geonames-poi.implicitTagRules
-------------------------
+-------------------------
 
-Also, you can examine the output rules database in a Sqlite database viewer, such as DB Browser for SQLite. 
+Also, you can examine the output rules database in a Sqlite database viewer, such as DB Browser for SQLite.
 
-Once rules have been generated, database derivation is then fairly quick, so you can tweak configuration parameters and regenerate 
+Once rules have been generated, database derivation is then fairly quick, so you can tweak configuration parameters and regenerate
 databases many times with ease over the course of trying to improve a set of implicit tag rules.
 
 === Customizing Database Generation Behavior
@@ -70,14 +70,14 @@ The following configuration options point to files which may be used to add cust
 - implicit.tagging.database.deriver.word.ignore.file - allows for ignore words (name tokens)
 - implicit.tagging.database.deriver.custom.rule.file - allows for adding/overriding tag rules
 
-You may need to modify the aforementioned files and then regenerate the rules database in order to achieve the desired tagging performance.  
+You may need to modify the aforementioned files and then regenerate the rules database in order to achieve the desired tagging performance.
 There are additional customization options available.  See the configuration options "implicit.tagging.*" and "implicit.tagger.*".
 
 === Tagger Influence on Conflation Performance
 
-At the time of this writing, the only tests currently set up to measure implicit feature tagger performance are those that are part 
-of the Global POI conflation multiary and Unifying POI/Polygon regression tests.  The multiary tests can be found in the global-pois 
-branch of the hoot-tests repository (hoot-tests/unifying-tests.child/model-training.child/train-multiary-poi.child) and the POI/Polygon 
+At the time of this writing, the only tests currently set up to measure implicit feature tagger performance are those that are part
+of the Global POI conflation multiary and Unifying POI/Polygon regression tests.  The multiary tests can be found in the global-pois
+branch of the hoot-tests repository (hoot-tests/unifying-tests.child/model-training.child/train-multiary-poi.child) and the POI/Polygon
 tests can be found in the hoot-tests master branch.
 
 To implicitly tag other data, see the "Implicit Element Type Tagging Based On Name" section of the Hootenanny User Guide.
Clone this wiki locally