Skip to content

Updating BigQuery Views

caroletouma edited this page Oct 26, 2021 · 2 revisions

It is quick and easy to add or update BigQuery views, but order of operations should be deliberate in the event that you are updating a view consumed in a downstream API, such as a metric package. This page uses updating a view consumed as a metric package for the Dashboard as a concrete example where relevant.

Adding a new view

  1. First, use the BigQuery interface to build and test the query for the new view

  2. Once the query is ready, make a new view in recidiviz-data/recidiviz/calculator/bq/dashboard/views/

  3. Be sure to add your view to the *_views.py file in the directory where your new view lives

  4. If updating a metric meant to be consumed as a metric package, run the dashboard_export_manager script locally with the appropriate bucket name, e.g. the staging dashboard metric package bucket

    1. Be sure that your local environment has the environment variable GOOGLE_CLOUD_PROJECT=recidiviz-staging
    2. If this succeeds, it will create a view in the appropriate dataset in BigQuery, e.g. the staging dashboard views dataset, and export the resulting file to the staging dashboard metric package bucket
  5. From there, update any downstream consumer that needs to read from the updated view. For the dashboard, add the name of the metric package file to the server/core/metricsApi.js to start retrieving via the app's API

  6. Thus, adding a new view that involves a downstream consumer update should typically require one pull request in recidiviz-data for the BigQuery view and one in the downstream app for the consumption

    1. DO NOT merge the app pull request before the recidiviz-data pull request
    2. Once both PRs are merged into main, you must wait until there has been a new deploy of the recidiviz-data platform to production before deploying a new version of the app that relies on the new query
    3. Double check that the file that the new logic depends on is in the appropriate metric storage bucket before deploying the dashboard to production
      1. Dashboard view data is exported every morning around 6am PT. To trigger a manual export of the dashboard view data, publish a message (any message) to the appropriate Pub/Sub topic

Updating the values on an existing chart

  1. First, use the BigQuery interface to edit and test the query of the view for the existing chart (however, don’t ever Save View and overwrite an existing view from within the BigQuery interface)
  2. Once the query is ready, make a NEW view in recidiviz-data/recidiviz/calculator/bq/dashboard/views/ with the updated query, with a different name than the existing view
  3. Be sure to add this view to the *_views.py file in the directory where the view lives
  4. This pull request can be merged and deployed to prod without waiting for any downstream consumer changes
  5. When it's time to transition the existing logic in the downstream app, run the dashboard_export_manager script locally with the appropriate bucket name, e.g. the staging dashboard metric package bucket
    1. Be sure that your local environment has the environment variable GOOGLE_CLOUD_PROJECT=recidiviz-staging
    2. If this succeeds, it will create a view in the appropriate dataset in BigQuery, e.g. the staging dashboard views dataset, and export the resulting file to the staging dashboard metric package bucket
  6. From there, update any downstream consumer that needs to read from the updated view. For the dashboard, add the name of the metric package file to the server/core/metricsApi.js to start retrieving via the app's API
    1. For the dashboard, update the app to start reading from this new file instead of its old view data file
    2. Once the recidiviz-data pull request has been deployed to prod, and the new file is in the appropriate metric package bucket, issue a pull request with the changes to the app
    3. DO NOT merge the downstream pull request before the recidiviz-data pull request has been deployed to prod and the file is in the cloud storage bucket in prod
      1. Dashboard view data is exported every morning around 6am PT. To trigger a manual export of the dashboard view data, publish a message (any message) to the appropriate Pub/Sub topic
    4. Then, issue a NEW PR in recidiviz-data that deletes the old view that's no longer being used by the dashboard. Once this is merged and deployed, delete the old file from any metric storage bucket it lives within