v0.2.47..v0.2.48 changeset PoiToPolygonConflation.asciidoc

Garret Voltz edited this page Sep 27, 2019 · 1 revision
diff --git a/docs/algorithms/PoiToPolygonConflation.asciidoc b/docs/algorithms/PoiToPolygonConflation.asciidoc
index 448ecb3..a7c0d15 100644
--- a/docs/algorithms/PoiToPolygonConflation.asciidoc
+++ b/docs/algorithms/PoiToPolygonConflation.asciidoc
@@ -13,32 +13,32 @@ any other tag which causes it to be classified as a POI or a building by the Hoo
 
 [[PoiToPolygonMatching]]
 === Matching
-  
+
 POI to Polygon conflation in Hootenanny is an additive, rule based conflation which follows these rules:
 
 * First, find all candidate POI / polygon pairs:
-** A candidate is any POI that meets the feature definition requirements and is within the combined circular error of a polygon plus a user 
+** A candidate is any POI that meets the feature definition requirements and is within the combined circular error of a polygon plus a user
 definable review distance (see below for the combined circular error calculation).
-** For certain feature types, Hootenanny defines a hardcoded custom review distance based on the type.  See 
+** For certain feature types, Hootenanny defines a hardcoded custom review distance based on the type.  See
 PoiPolygonDistance::getReviewDistanceForType.
-* **Match the two features:**  If the POI is within a user definable match distance AND either: 
+* **Match the two features:**  If the POI is within a user definable match distance AND either:
 ** the names of the two features are similar according to a configurable threshold OR
 ** the types of the two features are similar according to a configurable threshold OR
 ** the address of the two features are an exact match (standard OSM address tags are used; see PoiPolygonAddressScoreExtractor)
-** the phone number of the two features are a match (standard OSM and configurable phone number tags are used; see 
+** the phone number of the two features are a match (standard OSM and configurable phone number tags are used; see
 PoiPolygonPhoneNumberScoreExtractor)
-** If the Euclidean match distance requirement is not met, Hootenanny will also calculate the distance from the POI to a convex 
-polygon shape (alpha shape) derived from the polygon feature and use that value for the distance (certain restrictions 
+** If the Euclidean match distance requirement is not met, Hootenanny will also calculate the distance from the POI to a convex
+polygon shape (alpha shape) derived from the polygon feature and use that value for the distance (certain restrictions
 apply; see PoiPolygonMatch::_calculateEvidence).
-* **Force the two features to be manually reviewed:**  If the POI is within the review distance (accounting for circular error) and any 
+* **Force the two features to be manually reviewed:**  If the POI is within the review distance (accounting for circular error) and any
 one of the other previously listed criteria for a match is met.
 * **Do not match or review the two features:**  If the POI is not within the review distance of the polygon, regardless if any of the other
 match criteria are met.
 
-Unlike many of the matching routines intra-data set matches are allowed. This resolves issues that commonly occur in data sets where 
+Unlike many of the matching routines intra-data set matches are allowed. This resolves issues that commonly occur in data sets where
 polygons are duplicated in a POI layer.
 
-The circular error (CE) of the two input elements is assumed to be that we are 95% CE for each feature (2 sigma). To combine the two 
+The circular error (CE) of the two input elements is assumed to be that we are 95% CE for each feature (2 sigma). To combine the two
 values together into a single value that represents the 95% confidence that they're within that distance is:
 
 ------
@@ -59,44 +59,46 @@ Techniques that were experimented with but proved to add no benefit to the model
 [[PoiToPolygonMerging]]
 === Merging
 
-The first layer selected is the reference layer and the second, the secondary layer, same as in all other hoot conflation types.  The 
-reference layer gets priority on the tags that are kept.  Geometry merging is a little more complex as later described.
-  
-Once a relationship has been established between elements a graph is created to determine any interdependencies. If a single element is 
-involved in multiple matches then all the elements involved in that set of matches are marked as needing review. This avoids complex 
+The first layer selected is the reference layer and the second, the secondary layer, same as in all other hoot conflation types.  Which tags
+are kept is dependent on the selected tag merging strategy as described later.  Geometry merging is a little more complex as later described.
+
+Once a relationship has been established between elements a graph is created to determine any interdependencies. If a single element is
+involved in multiple matches then all the elements involved in that set of matches are marked as needing review. This avoids complex
 situations where there are multiple conflicting POI attributes.
 
-However, if a review relationship is found and a match relationship is found, the review relationship is not included in the 
-interdependence calculation. So, you may have a POI merged with one building, but marked as needing review with another building. 
+However, if a review relationship is found and a match relationship is found, the review relationship is not included in the
+interdependence calculation. So, you may have a POI merged with one building, but marked as needing review with another building.
 Modifying this business logic will require some user input on the desired functionality as well as some not so insignificant internal changes.
 
-If a merge is warranted, the geometry of the building is used and the tags are merged using the default tag merging mechanism 
-(+tag.merger.default+ configuration key).
+If a merge is warranted, the geometry of the building is used and the tags are merged using the tag merging mechanism defined in the
++poi.polygon.tag.merger+ configuration key. If that option is not defined, then the value from the +tag.merger.default+ configuration option
+is used. If the +poi.polygon.auto.merge.many.poi.to.one.poly.matches+ option is enabled, then all many POI to single polygon matches are always
+merged with hoot::PreserveTypesTagMerger, which retains type tags for merge features.
 
 Detailed Merging Workflow:
 
-* Merge the tags of all matching POIs in the reference layer together with each other if there is more than one POI in the reference 
+* Merge the tags of all matching POIs in the reference layer together with each other if there is more than one POI in the reference
 layer to merge
-* Merge the tags of all matching POIs in the secondary layer together with each other if there is more than one POI in the secondary 
+* Merge the tags of all matching POIs in the secondary layer together with each other if there is more than one POI in the secondary
 layer to merge
-* Merge the building tags for matching buildings from both layers together as described in 
+* Merge the building tags for matching buildings from both layers together as described in
 https://github.com/ngageoint/hootenanny/files/595244/Hootenanny.-.Building.Conflation.2014-08-19.pptx slide 6; they’re averged together
-* Merge the building geometries for matching buildings from both layers together as described in that slide; pick the most complex 
+* Merge the building geometries for matching buildings from both layers together as described in that slide; pick the most complex
 building geometry and if both are the same complexity then pick the first geometry
-* Merge the tags of the matching POIs and buildings from both layers with each other; all first layer reference tags take priority 
+* Merge the tags of the matching POIs and buildings from both layers with each other; all first layer reference tags take priority
 over secondary layer tags
 * Remove the POI geometries as they’ve been “merged” with the building geometries.
 
 [[PoiToPolygonConfigurableOptions]]
 === Configurable Options
-  
+
 See the User Guide Command Line Documentation section for all configuration options beginning with the text "poi.polygon".
 
 [[PoiToPolygonTestResults]]
 === Test Results
 
-Match truth for several datasets was obtained by having a human manual match features 
-(see https://github.com/ngageoint/hootenanny/files/595245/Hootenanny.-.Manual.Matching.9-13-16.pptx for more details on the process 
+Match truth for several datasets was obtained by having a human manual match features
+(see https://github.com/ngageoint/hootenanny/files/595245/Hootenanny.-.Manual.Matching.9-13-16.pptx for more details on the process
 involved).  Then, Hootenanny conflated the same data and scored how many matches it correctly made.
 
 .POI to Polygon Test Data Sources
@@ -106,7 +108,7 @@ involved).  Then, Hootenanny conflated the same data and scored how many matches
 | 1 | KisMaayo, Somalia | MGCP | UTP | good | good | poor | none | none
 | 2 | KisMaayo, Somalia | MGCP | OSM | good | good | poor in MGCP; average in OSM | none | none
 | 3 | San Francisco, USA | OSM | city govt | good | average | average | poor in OSM; none in city govt | average
-| 4 | Munich, Germany | OSM | NAVTEQ | good in OSM; poor near intersections for NAVTEQ | average for OSM; good for NAVTEQ | good | average | average  
+| 4 | Munich, Germany | OSM | NAVTEQ | good in OSM; poor near intersections for NAVTEQ | average for OSM; good for NAVTEQ | good | average | average
 | 5 | Cairo, Egypt | N/A | N/A | good for poly; average for POIs | good | good | none | average
 | 6 | Alexandria, Egypt | N/A | N/A | good for poly; average for POIs | good | good | none | poor
 | 7 | Rafah, Syria | N/A | N/A | good | good | poor for polys; good for POIs | none | none
@@ -119,7 +121,7 @@ involved).  Then, Hootenanny conflated the same data and scored how many matches
 | 1 | 58 | 14.8% | 84.2% | 1.0% | 5.367 | **99.0%**
 | 2 | 13 | 38.8% | 55.6% | 5.6% | 1.43 | **94.4%**
 | 3 | 989 | 21.7% | 70.7% | 7.6% | 1.20 | **92.4%**
-| 4 | 386 | 2.8% | 94.3% | 2.9%| 33.0 | **97.6%** 
+| 4 | 386 | 2.8% | 94.3% | 2.9%| 33.0 | **97.6%**
 | 5 | 56 | 61.8% | 33.3% | 4.9% | 0.54 | **95.1%**
 | 6 | 6 | 66.7% | 0.0% | 33.3% | 0.0 | **66.7%**
 | 7 | 5 | 100.0% | 0.0% | 0.0% | 0.2 | **100.0%**
@@ -134,6 +136,6 @@ Combined Correct = number of correct matches + number of unnecessary reviews
 * more intelligent POI merging
 * model based classification
 
-For more information on POI to polygon conflation: 
+For more information on POI to polygon conflation:
 https://github.com/ngageoint/hootenanny/files/607197/Hootenanny.-.POI.to.Polygon.2016-11-15.pptx
 
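The matching rules and the tag merger selection touched by this changeset can be sketched in a few lines of Python. This is only an illustration of the documented rules, not Hootenanny's implementation (the real logic lives in C++, e.g. PoiPolygonMatch::_calculateEvidence). The `Evidence` class, the `classify` and `select_tag_merger` functions, and the numeric distances in the usage note are hypothetical. The combined circular error is passed in as an input because its formula is not part of the hunks shown here, and the alpha shape fallback distance is omitted. The rules do not say what happens when a pair is within the review distance but has no supporting evidence; the sketch assumes a miss.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    """Hypothetical per-pair evidence, mirroring the criteria in the Matching rules."""
    distance: float              # POI-to-polygon distance in meters
    name_similar: bool = False   # name score met its configurable threshold
    type_similar: bool = False   # type score met its configurable threshold
    address_match: bool = False  # exact address match
    phone_match: bool = False    # phone number match

    def has_supporting_evidence(self) -> bool:
        return any((self.name_similar, self.type_similar,
                    self.address_match, self.phone_match))


def classify(ev: Evidence, match_distance: float,
             review_distance: float, combined_ce: float) -> str:
    """Return 'match', 'review' or 'miss' for one candidate POI/polygon pair."""
    # Outside the review distance (accounting for circular error): never
    # matched or reviewed, regardless of the other criteria.
    if ev.distance > review_distance + combined_ce:
        return "miss"
    # Within the match distance AND at least one other criterion: match.
    if ev.distance <= match_distance and ev.has_supporting_evidence():
        return "match"
    # Within the review distance and at least one other criterion: review.
    if ev.has_supporting_evidence():
        return "review"
    # Assumption: distance alone is not enough for a match or a review.
    return "miss"


def select_tag_merger(config: dict, many_pois_to_one_poly: bool) -> Optional[str]:
    """Pick the tag merger name following the fallback order in the Merging text."""
    auto_key = "poi.polygon.auto.merge.many.poi.to.one.poly.matches"
    if many_pois_to_one_poly and config.get(auto_key, False):
        return "hoot::PreserveTypesTagMerger"
    # Assumption: a missing or empty value counts as "not defined".
    return config.get("poi.polygon.tag.merger") or config.get("tag.merger.default")
```

With illustrative values (a 5 m match distance, a 125 m review distance, and a 10 m combined circular error, none of which are claimed to be Hootenanny defaults), a POI 3 m from a polygon with a similar name classifies as a match, while the same POI at 50 m classifies as a review.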