add DOF Property Charges Balance #290

austensen · 2023-12-10T23:33:15Z

Dept of Finance dataset showing how much properties owe to the city. Can be helpful in identifying buildings under financial distress.

Dataset: https://data.cityofnewyork.us/City-Government/DOF-Property-Charges-Balance/scjx-j6np

dataset/table name: dof_property_charges

add dataset to NYCDB
add documentation to wiki
add example query to wiki

Each task can be completed by a different person - comment below to claim a part of it

wstlabs · 2024-01-20T17:41:39Z

Getting started

wstlabs · 2024-01-20T19:19:31Z

I've pushed some (very) rough and ready code to the following branch:

https://github.com/nycdb/nycdb/tree/dev-291-dataset-dof-property-charge

That has been tested (successfully) on a partial load of 600k or so records (1 percent of the total of 62M, still downloading).
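For anyone wanting to try a similar partial load, here is a minimal sketch of cutting a sample from the head of a large CSV. It is not necessarily how the sample above was produced; the function name `take_sample` and the tiny inline data are made up for illustration. Note that the first N rows are not a random sample, and that a file still being downloaded may end in a truncated row, so N should stay well below the number of rows already on disk.

```python
import csv
import io
from itertools import islice


def take_sample(src, dst, n_rows: int) -> int:
    """Copy the header plus at most n_rows data rows; return rows copied."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        # Empty input: nothing to copy.
        return 0
    writer.writerow(header)
    copied = 0
    # islice stops after n_rows without reading the rest of the file.
    for row in islice(reader, n_rows):
        writer.writerow(row)
        copied += 1
    return copied


if __name__ == "__main__":
    # Hypothetical data; the real file has different columns.
    src = io.StringIO("a,b\n1,2\n3,4\n5,6\n")
    dst = io.StringIO()
    print(take_sample(src, dst, 2))
    print(dst.getvalue(), end="")
```

With real files, `src` and `dst` would be handles opened with `newline=""`, as the `csv` module documentation recommends.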

However, it's not ready for others to test, due to some apparent underlying weirdness in the existing codebase - which I'd like to run by @austensen (or someone else) before going into much detail here just yet.

As to the weirdness - it has to do with the forced CamelCase munging of column names (which apparently has unintended side effects). Should be easy enough to resolve (sometime in the coming days, after the hackathon).

kfinn · 2024-01-20T19:45:30Z

Hi @wstlabs! Do you mind clarifying exactly what you mean about the forced CamelCase munging of column names?

Alternatively, there's some documentation on the column name munging we do: https://github.com/nycdb/nycdb/blob/main/src/ADDING_NEW_DATASETS.md#-note- (see the bulleted list "Some examples of how column names are transformed:"). I wonder if this would add enough context to answer your questions.

wstlabs · 2024-01-20T20:10:57Z

Basically, the CamelCase munging seems to conflict with the explicit field declarations in the dataset config file (src/nycdb/datasets/dof_property_charge.yml in the new branch), which would seem to be more important.

At least my assumption was that the config provides the explicit schema. In presenting an explicit mapping of column names to types -- that definitely would seem to be its purpose. But no, it seems that's not the "real" schema that ends up being used -- or perhaps it is, in terms of column types, but not column names. Which are still automunged internally, per the above description.

Here's how it plays out in this case:

(1) The raw file contains some field names with underscores, e.g. dt_pd_begin, which (as per the writeup) nycdb is apparently trying to munge to DtPdBegin.

(2) Which apparently overrides the settings in the config file (src/nycdb/datasets/dof_property_charge.yml), contrary to expectations.

(3) So you'd think "Fine, I'll bring the config file in line with the automunged name then, to make everyone happy". But unfortunately, no -- it also apparently wants the field names in the CSV header to match the automunged names as well (meaning I had to edit the CSV, and change underscored names to CamelCase throughout) -- in order to get the file to load.

Which is not the way things are meant to be done, I'm assuming.
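To make the workaround in (3) concrete, here is a rough sketch of the header rewrite, assuming the munging behaves as described above (dt_pd_begin to DtPdBegin), which is unconfirmed. This is not nycdb code: `snake_to_camel` and `rewrite_header` are hypothetical helpers, and `amount` is a made-up column name used only for the demo.

```python
import csv
import io


def snake_to_camel(name: str) -> str:
    """Convert an underscored name such as dt_pd_begin to DtPdBegin."""
    # capitalize() upper-cases the first letter and lower-cases the rest.
    return "".join(part.capitalize() for part in name.split("_") if part)


def rewrite_header(src, dst) -> None:
    """Copy CSV rows from src to dst, renaming only the header row."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        # Empty input: nothing to rewrite.
        return
    writer.writerow([snake_to_camel(col) for col in header])
    # Data rows pass through unchanged.
    writer.writerows(reader)


if __name__ == "__main__":
    # "amount" is a hypothetical column, not taken from the real dataset.
    src = io.StringIO("dt_pd_begin,amount\n2020-01-01,10\n")
    dst = io.StringIO()
    rewrite_header(src, dst)
    print(dst.getvalue(), end="")
```

This only reproduces the manual edit; the cleaner fix would presumably be on the nycdb side, so that the raw header and the config can agree without touching the CSV.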

But at least the file (or a 1 percent sample of it) does load, with close to correct column types -- which is a good sign, in that it seems it should be pretty easy to get this dataset integrated (once the above weirdness is resolved).

austensen added the "new dataset" (New dataset to add to NYCDB), "documentation" (Add documentation to wiki), and "example query" (Add example query to wiki) labels Dec 10, 2023

austensen added this to the HDC Hackathon milestone Dec 10, 2023
