Return computed regions #176

Open
levyj opened this issue Jun 10, 2019 · 1 comment

Comments

@levyj
Member

levyj commented Jun 10, 2019

Socrata's Spatial Lens (most prominently used in map view filters) determines the geographic regions (e.g., Chicago Community Area) in which a point record falls. These regions are not presented in an easily understood way in either the grid view of a dataset or the /resource API, but they can be determined through a multi-step process. It would be a great service if RSocrata could perform this process and return the calculated regions, if present.

As an example, see https://data.cityofchicago.org/Buildings/Building-Violations/22u3-xenr.

The /resource API shows an example record:

[
{
"id": "6274501",
"violation_last_modified_date": "2019-06-13T06:50:00.000",
"violation_date": "2019-06-13T00:00:00.000",
"violation_code": "CN193029",
"violation_status": "OPEN",
"violation_description": "WATCHMAN",
"violation_ordinance": "Maintain watchman from 4:00 PM to 8:00 AM for vacant and dangerous residential premises. (13-12-140)",
"inspector_id": "BL00943",
"inspection_number": "12953099",
"inspection_status": "CLOSED",
"inspection_waived": "N",
"inspection_category": "COMPLAINT",
"department_bureau": "DEMOLITION",
"address": "5301 S JUSTINE ST",
"street_number": "5301",
"street_direction": "S",
"street_name": "JUSTINE",
"street_type": "ST",
"property_group": "331440",
"latitude": "41.797602233",
"longitude": "-87.663320286",
"location": {
"latitude": "41.797602233217454",
"longitude": "-87.6633202858523",
"human_address": "{\"address\": \"\", \"city\": \"\", \"state\": \"\", \"zip\": \"\"}"
},
":@computed_region_vrxf_vc4k": "59",
":@computed_region_6mkv_f3dw": "14924",
":@computed_region_rpca_8um6": "37",
":@computed_region_bdys_3d7i": "790",
":@computed_region_43wa_7qmu": "2",
":@computed_region_awaf_s7ux": "19"
}
]

Note in particular:

":@computed_region_vrxf_vc4k": "59"
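
As a first step, the computed-region fields can be separated from the ordinary columns of a /resource record by their prefix. The following is only a minimal Python sketch, not RSocrata code; the function name `computed_region_fields` is hypothetical, and the sample record is a trimmed copy of the one above.

```python
# Prefix that marks Spatial Lens columns in /resource output (as seen above).
COMPUTED_PREFIX = ":@computed_region_"


def computed_region_fields(record):
    """Return only the computed-region entries of one /resource record."""
    return {
        key: value
        for key, value in record.items()
        if key.startswith(COMPUTED_PREFIX)
    }


# Trimmed copy of the example record shown above.
record = {
    "id": "6274501",
    "address": "5301 S JUSTINE ST",
    ":@computed_region_vrxf_vc4k": "59",
    ":@computed_region_6mkv_f3dw": "14924",
}

regions = computed_region_fields(record)
print(regions)
```

The values returned here are still feature IDs, not region names or numbers.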

The /views API shows the following entry under columns:

{
    "id" : 342479787,
    "name" : "Community Areas",
    "dataTypeName" : "number",
    "fieldName" : ":@computed_region_vrxf_vc4k",
    "position" : 31,
    "renderTypeName" : "number",
    "tableColumnId" : 60501607,
    "computationStrategy" : {
      "source_columns" : [ "location" ],
      "type" : "georegion_match_on_point",
      "parameters" : {
        "region" : "_vrxf-vc4k",
        "primary_key" : "_feature_id"
      }
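
The column metadata above can be reduced to a lookup from field name to region dataset. The following Python sketch assumes the columns array has already been parsed from the /views response; the function name `region_columns` is hypothetical, and stripping the leading underscore from the `region` parameter is an assumption based only on this one example.

```python
def region_columns(columns):
    """Map each computed-region field name to its region dataset and key column."""
    mapping = {}
    for column in columns:
        strategy = column.get("computationStrategy") or {}
        # Only Spatial Lens point matches are of interest here.
        if strategy.get("type") != "georegion_match_on_point":
            continue
        params = strategy.get("parameters", {})
        mapping[column["fieldName"]] = {
            "name": column.get("name"),
            # Assumption: "_vrxf-vc4k" refers to dataset "vrxf-vc4k".
            "region_uid": params.get("region", "").lstrip("_"),
            "primary_key": params.get("primary_key"),
        }
    return mapping


columns = [
    # An ordinary column, trimmed to the one key this sketch reads.
    {"fieldName": "address"},
    # The computed-region column from the excerpt above, trimmed.
    {
        "name": "Community Areas",
        "fieldName": ":@computed_region_vrxf_vc4k",
        "computationStrategy": {
            "source_columns": ["location"],
            "type": "georegion_match_on_point",
            "parameters": {"region": "_vrxf-vc4k", "primary_key": "_feature_id"},
        },
    },
]

mapping = region_columns(columns)
print(mapping)
```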

So that value identifies the Community Area, but it is not, as it might appear, Community Area 59. Instead, examine https://data.cityofchicago.org/dataset/Community-Areas/vrxf-vc4k. (Note that the underscore in the computed_region field name becomes a hyphen in the dataset ID.) The 59 refers to the record in this dataset whose _feature_id is 59, which turns out to be Community Area 61. (As I discovered in working through this example, the feature IDs and Community Area numbers do match in many cases, which could lead people to think, incorrectly, that :@computed_region_vrxf_vc4k is the Community Area number itself.)
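
The two operations in this step (field name to dataset ID, then feature ID to region record) can be sketched in Python as follows. The helper names are hypothetical, and so is the `area_number` column name, which only stands in for whichever column of the region dataset holds the Community Area number; the single sample row reflects the values stated above.

```python
PREFIX = ":@computed_region_"


def region_dataset_id(field_name):
    """Convert ':@computed_region_vrxf_vc4k' to the dataset ID 'vrxf-vc4k'."""
    if not field_name.startswith(PREFIX):
        raise ValueError("not a computed-region field: " + field_name)
    return field_name[len(PREFIX):].replace("_", "-")


def find_feature(region_rows, feature_id, primary_key="_feature_id"):
    """Return the region row whose primary key equals feature_id, or None."""
    for row in region_rows:
        # Compare as strings, since the API returns numbers as strings.
        if str(row.get(primary_key)) == str(feature_id):
            return row
    return None


# One row of the region dataset; "area_number" is a hypothetical column name.
region_rows = [
    {"_feature_id": "59", "community": "NEW CITY", "area_number": "61"},
]

uid = region_dataset_id(":@computed_region_vrxf_vc4k")
row = find_feature(region_rows, "59")
print(uid, row["area_number"])
```

The point of the lookup is that the feature ID (59) and the Community Area number (61) differ, so the feature ID cannot be returned to the user as is.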

The final step is determining which column in this dataset holds the relevant value (Community Area, in this case). It should be fairly apparent to a person which column to use, so, given the fairly small number of computed regions (types of regions, not the individual regions) likely to be used on a domain, it might be feasible to rely on that in some manner. However, there is an API. For the record, Socrata gave me the following warning about it:

Please note: Engineering emphasized that this is not an official API, so you are welcome to consult it but just know it's not officially supported as a source of truth for automated processes.

That said, if we consult https://data.cityofchicago.org/api/curated_regions and search for vrxf-vc4k, we see:

{
"id": 261,
"name": "Community Areas",
"createdAt": 1445869668,
"defaultFlag": true,
"enabledFlag": true,
"featurePk": "_feature_id",
"geometryLabel": "community",
"uid": "vrxf-vc4k",
"view": {
"id": "vrxf-vc4k",
"name": "Community Areas",
"averageRating": 0,
"createdAt": 1424310233,
"displayType": "table",
"downloadCount": 4,
"hideFromCatalog": true,
"hideFromDataJson": true,
"indexUpdatedAt": 1494641500,
"newBackend": true,
"numberOfComments": 0,
"oid": 10269628,
"provenance": "official",
"publicationAppendEnabled": false,
"publicationDate": 1424310240,
"publicationGroup": 2273835,
"publicationStage": "published",
"tableId": 2273835,
"totalTimesRated": 0,
"viewCount": 64,
"viewLastModified": 1494640995,
"viewType": "tabular",
"grants": [
{
"inherited": false,
"type": "viewer",
"flags": [
"public"
]
}

The item of interest is:

"geometryLabel": "community"

That is, in fact, the API field name of the column in https://data.cityofchicago.org/dataset/Community-Areas/vrxf-vc4k that identifies the Community Area. Note, however, that the value in this column for the above example is not 61 but NEW CITY: the name of the Community Area rather than its number.
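
Putting the steps together, one possible shape for the requested feature is sketched below in Python. This is an illustration under stated assumptions rather than a proposed RSocrata implementation: it assumes /api/curated_regions returns a list of entries like the one above, `label_regions` is a hypothetical name, and `fetch_region_rows` is a hypothetical caller-supplied function that returns the rows of a region dataset (here replaced by an in-memory stand-in so the sketch runs without network access).

```python
PREFIX = ":@computed_region_"


def label_regions(record, curated_regions, fetch_region_rows):
    """Translate computed-region feature IDs in one record into readable labels.

    fetch_region_rows(uid) is supplied by the caller (hypothetical) and must
    return the rows of the region dataset as a list of dicts.
    """
    by_uid = {entry["uid"]: entry for entry in curated_regions}
    labels = {}
    for field, feature_id in record.items():
        if not field.startswith(PREFIX):
            continue
        # Underscore in the field name becomes a hyphen in the dataset ID.
        uid = field[len(PREFIX):].replace("_", "-")
        entry = by_uid.get(uid)
        if entry is None:
            continue  # region not listed in curated_regions; leave it out
        for row in fetch_region_rows(uid):
            if str(row.get(entry["featurePk"])) == str(feature_id):
                labels[entry["name"]] = row.get(entry["geometryLabel"])
                break
    return labels


# Trimmed curated_regions entry and region row from the examples above.
curated = [
    {
        "uid": "vrxf-vc4k",
        "name": "Community Areas",
        "featurePk": "_feature_id",
        "geometryLabel": "community",
    }
]
rows_by_uid = {"vrxf-vc4k": [{"_feature_id": "59", "community": "NEW CITY"}]}
record = {
    "id": "6274501",
    ":@computed_region_vrxf_vc4k": "59",
    ":@computed_region_6mkv_f3dw": "14924",
}

result = label_regions(record, curated, lambda uid: rows_by_uid.get(uid, []))
print(result)  # {'Community Areas': 'NEW CITY'}
```

Passing the row fetcher in as a parameter keeps the unofficial curated_regions endpoint and the network calls out of the core logic, which seems prudent given Socrata's warning that the endpoint is not officially supported.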

@levyj
Member Author

levyj commented Dec 8, 2020

Just as a note, this issue is partly an invitation for someone to take on this feature request, but at least as much a way to document the underlying structure. Especially as time passes, anyone who plans to rely on this information, whether to attempt the feature or for any other reason, would do well to confirm any critical portions with the RSocrata team and/or Socrata (https://github.com/socrata / https://dev.socrata.com/).
