turn supervision_officer_attribute_sessions into supervision_staff_attribute_sessions (Recidiviz/recidiviz-data#29335)

## Description of the change

Renamed the file and view, and added WHERE clauses to downstream views
to do the `role_type` filtering that
`supervision_officer_attribute_sessions` was doing. This allows us to
more easily query staff information for users who don't have an ingested
role_type (for example, getting information for specialists who are not
technically officers into the workflows supervision_staff_record).

Will post the sandbox prefix once it's done loading!
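
As a rough illustration of the filtering that now lives in downstream views, the sketch below mimics `"SUPERVISION_OFFICER" IN UNNEST(role_type_array)` in plain Python. The rows, ids, and the `OTHER_ROLE` value are invented for the example, and modeling a missing ingested role_type as an empty list is an assumption; only the `officer_id` and `role_type_array` column names come from the diff.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical rows shaped like the renamed view's output (values are invented).
SESSIONS: List[Row] = [
    {"officer_id": "A1", "role_type_array": ["SUPERVISION_OFFICER"]},
    # Assumption: a specialist with no ingested role_type ends up with no roles.
    {"officer_id": "B2", "role_type_array": []},
    {"officer_id": "C3", "role_type_array": ["OTHER_ROLE", "SUPERVISION_OFFICER"]},
]


def officers_only(rows: List[Row]) -> List[Row]:
    """Python analogue of: WHERE "SUPERVISION_OFFICER" IN UNNEST(role_type_array)."""
    return [
        row
        for row in rows
        if "SUPERVISION_OFFICER" in (row["role_type_array"] or [])
    ]


if __name__ == "__main__":
    # The unfiltered list plays the role of the staff view; the filtered list is
    # what officer-only downstream views would keep.
    print([row["officer_id"] for row in SESSIONS])
    print([row["officer_id"] for row in officers_only(SESSIONS)])
```

In this toy data the specialist `B2` stays visible to consumers that read the unfiltered rows, while officer-only consumers drop it.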

## Type of change

> All pull requests must have at least one of the following labels
applied (otherwise the PR will fail):

| Label | Description |
|---|---|
| Type: Bug | non-breaking change that fixes an issue |
| Type: Feature | non-breaking change that adds functionality |
| Type: Breaking Change | fix or feature that would cause existing functionality to not work as expected |
| Type: Non-breaking refactor | change addresses some tech debt item or prepares for a later change, but does not change functionality |
| Type: Configuration Change | adjusts configuration to achieve some end related to functionality, development, performance, or security |
| Type: Dependency Upgrade | upgrades a project dependency - these changes are not included in release notes |

## Related issues

Closes Recidiviz/recidiviz-data#29066

## Checklists

### Development

**This box MUST be checked by the submitter prior to merging**:
- [x] **Double- and triple-checked that there is no Personally
Identifiable Information (PII) being mistakenly added in this pull
request**

These boxes should be checked by the submitter prior to merging:
- [ ] Tests have been written to cover the code changed/added as part of
this pull request

### Code review

These boxes should be checked by reviewers prior to merging:

- [ ] This pull request has a descriptive title and information useful
to a reviewer
- [ ] Potential security implications or infrastructural changes have
been considered, if relevant

GitOrigin-RevId: 0fd8c4ae7e5b19e95705af59e7ac9f57393a81cf
danawillow authored and Helper Bot committed May 11, 2024
1 parent 422d774 commit 48ecca1
Showing 7 changed files with 41 additions and 32 deletions.
5 changes: 3 additions & 2 deletions recidiviz/aggregated_metrics/misc_aggregated_metrics.py
@@ -145,11 +145,12 @@ def _query_template_and_format_args(
  FROM
  `{{project_id}}.aggregated_metrics.supervision_officer_aggregated_metrics_materialized` a
  INNER JOIN
- `{{project_id}}.sessions.supervision_officer_attribute_sessions_materialized` b
+ `{{project_id}}.sessions.supervision_staff_attribute_sessions_materialized` b
  ON
  a.state_code = b.state_code
  AND a.officer_id = b.officer_id
- AND a.end_date BETWEEN b.start_date AND {nonnull_end_date_exclusive_clause("b.end_date_exclusive")},
+ AND a.end_date BETWEEN b.start_date AND {nonnull_end_date_exclusive_clause("b.end_date_exclusive")}
+ AND "SUPERVISION_OFFICER" IN UNNEST(b.role_type_array),
  UNNEST(supervisor_staff_id_array) AS supervisor_staff_id
  )
@@ -243,7 +243,10 @@ def get_index_columns_query_string(self, prefix: Optional[str] = None) -> str:
  (
  MetricUnitOfObservationType.SUPERVISION_OFFICER,
  MetricUnitOfAnalysisType.SUPERVISION_OFFICER,
- ): """SELECT * FROM `{project_id}.sessions.supervision_officer_attribute_sessions_materialized`""",
+ ): """
+ SELECT * FROM `{project_id}.sessions.supervision_staff_attribute_sessions_materialized`
+ WHERE "SUPERVISION_OFFICER" IN UNNEST(role_type_array)
+ """,
  (
  MetricUnitOfObservationType.SUPERVISION_OFFICER,
  MetricUnitOfAnalysisType.SUPERVISION_OFFICE,
@@ -252,14 +255,17 @@ def get_index_columns_query_string(self, prefix: Optional[str] = None) -> str:
  supervision_office_id AS office,
  supervision_district_id AS district,
  FROM
- `{project_id}.sessions.supervision_officer_attribute_sessions_materialized`""",
+ `{project_id}.sessions.supervision_staff_attribute_sessions_materialized`
+ WHERE "SUPERVISION_OFFICER" IN UNNEST(role_type_array)
+ """,
  (
  MetricUnitOfObservationType.SUPERVISION_OFFICER,
  MetricUnitOfAnalysisType.SUPERVISION_DISTRICT,
  ): """SELECT
  *, supervision_district_id AS district,
  FROM
- `{project_id}.sessions.supervision_officer_attribute_sessions_materialized`
+ `{project_id}.sessions.supervision_staff_attribute_sessions_materialized`
+ WHERE "SUPERVISION_OFFICER" IN UNNEST(role_type_array)
  """,
  (
  MetricUnitOfObservationType.SUPERVISION_OFFICER,
@@ -268,13 +274,17 @@ def get_index_columns_query_string(self, prefix: Optional[str] = None) -> str:
  *,
  supervisor_staff_id AS unit_supervisor,
  FROM
- `{project_id}.sessions.supervision_officer_attribute_sessions_materialized`,
+ `{project_id}.sessions.supervision_staff_attribute_sessions_materialized`,
  UNNEST(supervisor_staff_id_array) AS supervisor_staff_id
+ WHERE "SUPERVISION_OFFICER" IN UNNEST(role_type_array)
  """,
  (
  MetricUnitOfObservationType.SUPERVISION_OFFICER,
  MetricUnitOfAnalysisType.STATE_CODE,
- ): """SELECT * FROM `{project_id}.sessions.supervision_officer_attribute_sessions_materialized`""",
+ ): """
+ SELECT * FROM `{project_id}.sessions.supervision_staff_attribute_sessions_materialized`
+ WHERE "SUPERVISION_OFFICER" IN UNNEST(role_type_array)
+ """,
  }

  UNIT_OF_ANALYSIS_STATIC_ATTRIBUTE_COLS_QUERY_DICT: Dict[
@@ -72,12 +72,13 @@ def staff_query_template(role: str) -> str:
  supervisor_external_id,
  attrs.specialized_caseload_type_primary AS specialized_caseload_type,
  FROM ({source_tbl}) supervision_staff
- INNER JOIN `{{project_id}}.sessions.supervision_officer_attribute_sessions_materialized` attrs
+ INNER JOIN `{{project_id}}.sessions.supervision_staff_attribute_sessions_materialized` attrs
  ON attrs.state_code = supervision_staff.state_code AND attrs.officer_id = supervision_staff.external_id
  LEFT JOIN UNNEST(attrs.supervisor_staff_external_id_array) AS supervisor_external_id
  INNER JOIN `{{project_id}}.normalized_state.state_staff` staff
  ON attrs.staff_id = staff.staff_id AND attrs.state_code = staff.state_code
  WHERE staff.state_code = '{state}'
+ AND "SUPERVISION_OFFICER" IN UNNEST(attrs.role_type_array)
  {f"AND {config.supervision_staff_exclusions}" if config.supervision_staff_exclusions else ""}
  -- Get the staff's attributes from the most recent session
  QUALIFY ROW_NUMBER() OVER(PARTITION BY attrs.state_code, attrs.officer_id ORDER BY COALESCE(attrs.end_date_exclusive, "9999-01-01") DESC) = 1
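
The `QUALIFY ROW_NUMBER() ... = 1` clause in the hunk above keeps one row per `(state_code, officer_id)`, preferring the session that ends latest and treating an open session (NULL `end_date_exclusive`) as ending on 9999-01-01. A minimal Python sketch of that dedup follows, assuming hypothetical in-memory rows; the sample values are invented, and ties are broken by keeping the first row seen, whereas BigQuery would pick arbitrarily among ties.

```python
from datetime import date
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]

# Mirrors COALESCE(attrs.end_date_exclusive, "9999-01-01") in the query.
FAR_FUTURE = date(9999, 1, 1)


def most_recent_sessions(rows: List[Row]) -> List[Row]:
    """Keep, per (state_code, officer_id), the row with the latest end date."""
    latest: Dict[Tuple[str, str], Row] = {}
    for row in rows:
        key = (row["state_code"], row["officer_id"])
        end = row["end_date_exclusive"] or FAR_FUTURE
        current = latest.get(key)
        if current is None or end > (current["end_date_exclusive"] or FAR_FUTURE):
            latest[key] = row
    return list(latest.values())


# Hypothetical sample data; only the column names come from the diff.
ROWS: List[Row] = [
    {"state_code": "US_XX", "officer_id": "A1", "end_date_exclusive": date(2023, 6, 1)},
    {"state_code": "US_XX", "officer_id": "A1", "end_date_exclusive": None},
    {"state_code": "US_XX", "officer_id": "B2", "end_date_exclusive": date(2024, 1, 1)},
]

if __name__ == "__main__":
    for row in most_recent_sessions(ROWS):
        print(row["officer_id"], row["end_date_exclusive"])
```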
@@ -177,9 +177,6 @@
  from recidiviz.calculator.query.state.views.sessions.supervision_level_sessions import (
  SUPERVISION_LEVEL_SESSIONS_VIEW_BUILDER,
  )
- from recidiviz.calculator.query.state.views.sessions.supervision_officer_attribute_sessions import (
- SUPERVISION_OFFICER_ATTRIBUTE_SESSIONS_VIEW_BUILDER,
- )
  from recidiviz.calculator.query.state.views.sessions.supervision_officer_inferred_location_sessions import (
  SUPERVISION_OFFICER_INFERRED_LOCATION_SESSIONS_VIEW_BUILDER,
  )
@@ -189,6 +186,9 @@
  from recidiviz.calculator.query.state.views.sessions.supervision_projected_completion_date_spans import (
  SUPERVISION_PROJECTED_COMPLETION_DATE_SPANS_VIEW_BUILDER,
  )
+ from recidiviz.calculator.query.state.views.sessions.supervision_staff_attribute_sessions import (
+ SUPERVISION_STAFF_ATTRIBUTE_SESSIONS_VIEW_BUILDER,
+ )
  from recidiviz.calculator.query.state.views.sessions.supervision_super_sessions import (
  SUPERVISION_SUPER_SESSIONS_VIEW_BUILDER,
  )
@@ -343,7 +343,7 @@
  SUPERVISION_LEVEL_SESSIONS_VIEW_BUILDER,
  SUPERVISION_OFFICER_INFERRED_LOCATION_SESSIONS_VIEW_BUILDER,
  SUPERVISION_OFFICER_SESSIONS_VIEW_BUILDER,
- SUPERVISION_OFFICER_ATTRIBUTE_SESSIONS_VIEW_BUILDER,
+ SUPERVISION_STAFF_ATTRIBUTE_SESSIONS_VIEW_BUILDER,
  SUPERVISION_UNIT_SUPERVISOR_SESSIONS_VIEW_BUILDER,
  SUPERVISION_SUPER_SESSIONS_VIEW_BUILDER,
  SUPERVISION_TOOL_ACCESS_SESSIONS_VIEW_BUILDER,
@@ -30,17 +30,15 @@
  from recidiviz.utils.environment import GCP_PROJECT_STAGING
  from recidiviz.utils.metadata import local_project_id_override

- SUPERVISION_OFFICER_ATTRIBUTE_SESSIONS_VIEW_NAME = (
- "supervision_officer_attribute_sessions"
- )
+ SUPERVISION_STAFF_ATTRIBUTE_SESSIONS_VIEW_NAME = "supervision_staff_attribute_sessions"

- SUPERVISION_OFFICER_ATTRIBUTE_SESSIONS_VIEW_DESCRIPTION = """
+ SUPERVISION_STAFF_ATTRIBUTE_SESSIONS_VIEW_DESCRIPTION = """
  View that preprocesses state staff periods to extract relevant attributes and external id's.
  """

  # All dictionary values below should specify a list of values by which to sort rows for deduplication.
  # All columns referenced in a given list should be queryable within the `sub_sessions_dedup` cte below.
- _SUPERVISION_OFFICER_ATTRIBUTES_NO_OVERLAPS: Dict[str, List[str]] = {
+ _SUPERVISION_STAFF_ATTRIBUTES_NO_OVERLAPS: Dict[str, List[str]] = {
  "supervision_district_id": [],
  "supervision_district_name": [],
  "supervision_office_id": [],
@@ -52,7 +50,7 @@
  "supervision_office_id_inferred": [],
  }

- _SUPERVISION_OFFICER_ATTRIBUTES_WITH_OVERLAPS: Dict[str, List[str]] = {
+ _SUPERVISION_STAFF_ATTRIBUTES_WITH_OVERLAPS: Dict[str, List[str]] = {
  "role_subtype": ["COALESCE(role_subtype_priority, 99)"],
  "role_type": [],
  "specialized_caseload_type": [],
@@ -63,7 +61,7 @@
  "supervisor_staff_id": ["supervisor_staff_external_id", "supervisor_staff_id"],
  }

- SUPERVISION_OFFICER_ATTRIBUTE_SESSIONS_QUERY_TEMPLATE = f"""
+ SUPERVISION_STAFF_ATTRIBUTE_SESSIONS_QUERY_TEMPLATE = f"""
  WITH all_staff_attribute_periods AS (
  -- location periods
  SELECT
@@ -217,12 +215,12 @@
  end_date AS end_date_exclusive,
  -- Apply an arbitrary dedup to attributes that we don't expect to overlap, mostly as an added protection
  {generate_largest_value_query_fragment(
- table_columns_with_priority_columns=_SUPERVISION_OFFICER_ATTRIBUTES_NO_OVERLAPS,
+ table_columns_with_priority_columns=_SUPERVISION_STAFF_ATTRIBUTES_NO_OVERLAPS,
  partition_columns=["state_code", "staff_id", "start_date"],
  )},
  -- For attributes that might have overlap, dedup via the configured priority order and suffix with "_primary"
  {generate_largest_value_query_fragment(
- table_columns_with_priority_columns=_SUPERVISION_OFFICER_ATTRIBUTES_WITH_OVERLAPS,
+ table_columns_with_priority_columns=_SUPERVISION_STAFF_ATTRIBUTES_WITH_OVERLAPS,
  partition_columns=["state_code", "staff_id", "start_date"],
  column_suffix="_primary"
  )},
@@ -247,7 +245,7 @@
  {list_to_query_string(
  [
  f"ARRAY_AGG(DISTINCT {attr} IGNORE NULLS) AS {attr}_array"
- for attr in _SUPERVISION_OFFICER_ATTRIBUTES_WITH_OVERLAPS
+ for attr in _SUPERVISION_STAFF_ATTRIBUTES_WITH_OVERLAPS
  ]
  )}
  FROM
@@ -260,7 +258,7 @@
  SELECT
  b.external_id AS officer_id,
  a.*,
- {list_to_query_string([f"c.{attr}_array" for attr in _SUPERVISION_OFFICER_ATTRIBUTES_WITH_OVERLAPS])},
+ {list_to_query_string([f"c.{attr}_array" for attr in _SUPERVISION_STAFF_ATTRIBUTES_WITH_OVERLAPS])},
  FROM
  sub_sessions_dedup a
  LEFT JOIN
@@ -271,19 +269,17 @@
  attribute_arrays c
  USING
  (state_code, staff_id, start_date)
- WHERE
- "SUPERVISION_OFFICER" IN UNNEST(role_type_array)
  """

- SUPERVISION_OFFICER_ATTRIBUTE_SESSIONS_VIEW_BUILDER = SimpleBigQueryViewBuilder(
+ SUPERVISION_STAFF_ATTRIBUTE_SESSIONS_VIEW_BUILDER = SimpleBigQueryViewBuilder(
  dataset_id=SESSIONS_DATASET,
- view_id=SUPERVISION_OFFICER_ATTRIBUTE_SESSIONS_VIEW_NAME,
- view_query_template=SUPERVISION_OFFICER_ATTRIBUTE_SESSIONS_QUERY_TEMPLATE,
- description=SUPERVISION_OFFICER_ATTRIBUTE_SESSIONS_VIEW_DESCRIPTION,
+ view_id=SUPERVISION_STAFF_ATTRIBUTE_SESSIONS_VIEW_NAME,
+ view_query_template=SUPERVISION_STAFF_ATTRIBUTE_SESSIONS_QUERY_TEMPLATE,
+ description=SUPERVISION_STAFF_ATTRIBUTE_SESSIONS_VIEW_DESCRIPTION,
  clustering_fields=["state_code", "staff_id"],
  should_materialize=True,
  )

  if __name__ == "__main__":
  with local_project_id_override(GCP_PROJECT_STAGING):
- SUPERVISION_OFFICER_ATTRIBUTE_SESSIONS_VIEW_BUILDER.build_and_print()
+ SUPERVISION_STAFF_ATTRIBUTE_SESSIONS_VIEW_BUILDER.build_and_print()
@@ -55,13 +55,14 @@
  a.start_date,
  a.end_date_exclusive,
  FROM
- `{{project_id}}.sessions.supervision_officer_attribute_sessions_materialized` a,
+ `{{project_id}}.sessions.supervision_staff_attribute_sessions_materialized` a,
  UNNEST(supervisor_staff_id_array) AS supervisor_staff_id
  INNER JOIN
  `{{project_id}}.normalized_state.state_staff` b
  ON
  a.state_code = b.state_code
  AND supervisor_staff_id = b.staff_id
+ WHERE "SUPERVISION_OFFICER" IN UNNEST(a.role_type_array)
  )
  ,
  overlapping_spans AS (
@@ -127,7 +127,7 @@
  attrs.supervisor_staff_external_id_array[SAFE_OFFSET(0)] AS supervisor_external_id,
  attrs.supervisor_staff_external_id_array AS supervisor_external_ids,
  FROM full_query
- LEFT JOIN `{{project_id}}.sessions.supervision_officer_attribute_sessions_materialized` attrs
+ LEFT JOIN `{{project_id}}.sessions.supervision_staff_attribute_sessions_materialized` attrs
  ON full_query.id = attrs.officer_id
  AND full_query.state_code = attrs.state_code
  AND {today_between_start_date_and_nullable_end_date_clause("start_date", "end_date_exclusive")}
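
The same `role_type_array` predicate is written out by hand in several of the downstream views touched above, sometimes with a table alias (`a`, `b`, `attrs`) and sometimes without. One possible way to keep those copies consistent is a small string helper; the sketch below is hypothetical, is not part of this commit, and `role_type_filter_clause` is an invented name rather than an existing Recidiviz function.

```python
def role_type_filter_clause(role_type: str, table_alias: str = "") -> str:
    """Build a BigQuery predicate that checks an array column for a role type.

    Hypothetical helper: it only formats the string used throughout the diff and
    does no escaping, so role_type is assumed to be a trusted constant.
    """
    column = f"{table_alias}.role_type_array" if table_alias else "role_type_array"
    return f'"{role_type}" IN UNNEST({column})'


if __name__ == "__main__":
    # Unaliased form, as in the WHERE clauses added to the index-column queries.
    print(role_type_filter_clause("SUPERVISION_OFFICER"))
    # Aliased form, as in the staff query template that joins on `attrs`.
    print(role_type_filter_clause("SUPERVISION_OFFICER", table_alias="attrs"))
```

Inside the f-string query templates, a helper like this would be interpolated the same way `nonnull_end_date_exclusive_clause(...)` is in the first hunk.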
