
COI preservation updater #9

Open · wants to merge 5 commits into main
Conversation

@pjrule (Contributor) commented Dec 4, 2021

This PR adds a more polished version of the COI preservation calculations from coi-states to the evaluation suite.

TODO

  • More tests
  • New variant (fractional scores)
  • Extra documentation for data sources?

Usage

The COI preservation updater assumes that COIs (or COI aggregations, a.k.a. geoclusters) and dual graph units can be (approximately) represented with a common block unit. Typically, this common unit is the 2020 U.S. Census block. Updaters are specialized to a particular set of COIs and a particular dual graph.
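Concretely, the common-block-unit assumption means that every COI and every dual graph unit can be expressed as a set of block IDs, so overlaps reduce to set intersections weighted by block population. A toy sketch of this idea (block IDs and populations are made up, not real Census data):

```python
# Toy illustration: COIs and dual graph units as sets of common block IDs.
# All IDs and populations here are made up, not real Census data.
coi_blocks = {"coi_a": {"b1", "b2", "b3"}}
unit_blocks = {"vtd_1": {"b1", "b2"}, "vtd_2": {"b3", "b4"}}
block_pops = {"b1": 100, "b2": 50, "b3": 50, "b4": 200}

# Population of coi_a contained in each unit, via set intersection.
overlap_pops = {
    unit: sum(block_pops[b] for b in blocks & coi_blocks["coi_a"])
    for unit, blocks in unit_blocks.items()
}
print(overlap_pops)  # {'vtd_1': 150, 'vtd_2': 50}
```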

Example: preservation of Wisconsin geoclusters

Suppose wisconsin_clusters.csv is a table of geoclusters generated by the pipeline in coi-states. We can load 2020 Census block approximations of the geoclusters as follows:

```python
import pandas as pd
from ast import literal_eval

clusters_df = pd.read_csv('wisconsin_clusters.csv').set_index('id')
clusters_df['blocks_2020'] = clusters_df['blocks_2020'].apply(literal_eval)
coi_blocks = {coi: set(blocks) for coi, blocks in clusters_df['blocks_2020'].items()}
```

(To compute the preservation of individual COI submissions instead of geoclusters, simply swap in a submission-level dataset with a blocks_2020 column.)
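For instance, a submission-level table with the same blocks_2020 column loads identically. A self-contained toy illustration (inline CSV via StringIO; the submission IDs and block IDs are made up):

```python
import io
from ast import literal_eval

import pandas as pd

# Toy inline stand-in for a submission-level CSV; in practice this would
# be a file with an 'id' column and a 'blocks_2020' column.
csv_text = (
    "id,blocks_2020\n"
    "s1,\"['550250001011000', '550250001011001']\"\n"
)
subs_df = pd.read_csv(io.StringIO(csv_text)).set_index('id')
subs_df['blocks_2020'] = subs_df['blocks_2020'].apply(literal_eval)
coi_blocks = {sub: set(blocks) for sub, blocks in subs_df['blocks_2020'].items()}
```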

An exact correspondence between 2020 Census blocks and 2020 Census VTDs can similarly be loaded from the official Census block assignment files (BAFs). We expect that the same node identifier is used in the dual graph and in the VTD-block calculations, so it may be necessary to map between Census GeoIDs and node indices.

```python
from collections import defaultdict

from gerrychain import Graph

graph = Graph.from_json('wi_vtds_0_indexed.json')
# Map VTD GeoIDs to dual graph node indices.
geoid_to_node_index = {geoid: node for node, geoid in graph.nodes('GEOID20')}

vtd_block_df = pd.read_csv('BlockAssign_ST55_WI_VTD.txt', sep='|', dtype=str).set_index('BLOCKID')
# VTD GeoID = state FIPS (55) + county FIPS (3 digits) + district code (6 digits).
vtd_block_df['vtd_id'] = '55' + vtd_block_df['COUNTYFP'].str.zfill(3) + vtd_block_df['DISTRICT'].str.zfill(6)
vtd_blocks = defaultdict(set)
for block, vtd_geoid in vtd_block_df['vtd_id'].items():
    vtd_blocks[geoid_to_node_index[vtd_geoid]].add(block)
```

Block total populations (P1_001N) can be retrieved via the Census API.

```python
from census import Census

pl_client = Census(None).pl  # as of now, we can get away without a (free) Census API key
block_pop_df = pd.DataFrame(pl_client.get(['P1_001N'], {'for': 'block:*', 'in': 'state:55 county:*'}))
# Block GeoID = state (2) + county (3) + tract (6) + block (4).
block_pop_df['GEOID20'] = (
    block_pop_df['state'].astype(str) +
    block_pop_df['county'].astype(str).str.zfill(3) +
    block_pop_df['tract'].astype(str).str.zfill(6) +
    block_pop_df['block'].astype(str).str.zfill(4)
)
block_pop_df = block_pop_df.set_index('GEOID20')
# The API returns counts as strings; cast to int before use.
block_pops = block_pop_df['P1_001N'].astype(int).to_dict()
```

Then, we can generate a COI preservation updater over a range of preservation thresholds:

```python
from evaltools.evaluation import block_level_coi_preservation

score_fn_partial_dists = block_level_coi_preservation(
    unit_blocks=vtd_blocks,  # the VTD -> block correspondence built above
    coi_blocks=coi_blocks,
    block_pops=block_pops,
    thresholds=(0.75, 0.8, 0.85, 0.9, 0.95),
    partial_districts=True,
)
```
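To make the thresholds concrete, here is one plausible reading of thresholded preservation (an assumption about the semantics, not taken from the evaltools source): a COI counts as preserved at threshold t when a single district contains at least a fraction t of the COI's population. A toy sketch of that scoring rule:

```python
# Toy sketch of a thresholded preservation count (assumed semantics,
# NOT the evaltools implementation): a COI is "preserved" at threshold t
# if some single district holds at least fraction t of its population.
def preserved_count(coi_pops_by_district, thresholds):
    counts = {}
    for t in thresholds:
        counts[t] = sum(
            1 for dist_pops in coi_pops_by_district.values()
            if max(dist_pops.values()) / sum(dist_pops.values()) >= t
        )
    return counts

# COI -> {district: COI population assigned to that district} (made-up data).
coi_pops = {
    "coi_a": {1: 90, 2: 10},   # 90% of coi_a sits in district 1
    "coi_b": {1: 60, 2: 40},   # only 60% of coi_b sits in one district
}
print(preserved_count(coi_pops, (0.75, 0.9)))  # {0.75: 1, 0.9: 1}
```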

* Implements partial districts variant
* Adds another test (for the case where COI size > district size)
* Refines documentation
@InnovativeInventor changed the title from “[WIP] COI preservation updater” to “COI preservation updater” on Dec 10, 2021
@InnovativeInventor (Member) left a comment

Looks good to me. Not sure if the numpy stuff is entirely needed, though (pandas may be sufficient for this).
