Release and Deploy

caroletouma edited this page Oct 26, 2021 · 3 revisions

Release

To create a new release candidate, tag the repository at a specific commit, most commonly the current head. You can do this with the git CLI or the GitHub UI (Code > Releases). This must be done against the private version of the repository, not the public mirror on which this wiki is hosted.

Because a release is a tag against our monolithic platform repository, which contains multiple services, one release encompasses the entire platform, though deploys vary by service.

When creating a new release, make sure to do the following:

  • Set a tag version using a semantic versioning format, e.g. v1.1.15.
    • Generally, increment the rightmost number (the patch version). Increment the middle number (the minor version) when there has been a major milestone release. Increment the leftmost number (the major version) only for a platform-shifting, backwards-incompatible change. This should practically never happen.
  • Choose a target branch: use main if you intend to eventually deploy the release to production.
  • Set a release title based on the current date, e.g. 2019-10-06 Release. If multiple releases are created in a day, append a number to the end, e.g. 2019-10-06 Release #2.
  • Add brief release notes indicating the changes in this release candidate relative to the most recent release. If this release is experimental and not meant to be deployed to production, make that very clear in the notes.
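
The tagging steps above can be sketched with the git CLI. This is a minimal sketch, not part of the repository's tooling: the helper names next_patch_tag and cut_release are hypothetical, and it assumes the private repository is the remote named origin. The tag annotation reuses the date-based title format, but release notes would still need to be added in the GitHub UI.

```shell
# Hypothetical helper (not part of the repository): bump the rightmost
# number of a tag such as v1.1.15. Variables are global because POSIX sh
# has no `local`.
next_patch_tag() {
  tag="$1"
  # Accept only vMAJOR.MINOR.PATCH with no leading zeros.
  if ! printf '%s\n' "$tag" |
    grep -Eq '^v(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)$'; then
    echo "expected a tag like v1.1.15, got: $tag" >&2
    return 1
  fi
  prefix="${tag%.*}"   # everything before the last dot, e.g. v1.1
  patch="${tag##*.}"   # everything after the last dot, e.g. 15
  echo "${prefix}.$((patch + 1))"
}

# Hypothetical helper: create an annotated tag at the current head and
# push it. Assumes the private repository is the remote named "origin".
cut_release() {
  new_tag="$1"
  git tag -a "$new_tag" -m "$(date +%Y-%m-%d) Release" &&
    git push origin "$new_tag"
}

next_patch_tag v1.1.15   # prints v1.1.16
```

Calling cut_release "$(next_patch_tag v1.1.15)" would then tag and push in one step; nothing is pushed until cut_release is invoked explicitly.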

After the release is cut, it will be automatically deployed to the staging environment via our Travis CI build.

Deploy

Note: unlike many operations, deploy commands generally do not need to be run from inside a pipenv shell.

Main platform application layer

The main platform application layer is deployed to Google App Engine. There are two tiers: staging and production.

Staging

Deploys to staging are done automatically when a new release is prepared, via the /travis/deploy_staging.sh script. This script fetches the encrypted authentication keys necessary for the deploy, deploys the cron.yaml configuration, deploys the updated Cloud Tasks queue configuration, pushes the specified version of the application Docker image to Container Registry, and then deploys the application itself with reference to that Docker image. The deploy process can be monitored via the logs on the Travis CI build.

From local

Deploys to staging can also be done directly from an authorized developer's local machine, via the deploy_local_to_staging.sh script. This is useful to experiment with possible changes that are not yet ready for primetime. This script is functionally the same as the automated staging deploy, with the following differences:

  • It does not deploy updates to cron.yaml.
  • It also deploys all configured calculation pipelines to Cloud Dataflow staging (separately from the main application layer on GAE).
  • It builds the Docker image based on the current state of the repository on your local machine.
  • It deploys with the --no-promote flag, which means that the new version does not automatically receive traffic. Thus, it is generally safe to deploy experimental builds to staging from local machines without impacting ongoing operations in staging (though if you update persistent state outside of GAE, you may need to coordinate with others).

Run the command as follows: ./deploy_local_to_staging.sh v1.1.15

Production

When a release has been thoroughly tested in staging, it can be deployed to production. This is done from an authorized developer's local machine, via the deploy_production.sh script. This script is functionally the same as the automated staging deploy, with the only difference being that it also deploys all configured calculation pipelines to Cloud Dataflow production (separately from the main application layer on GAE).

Run the command as follows: ./deploy_production.sh v1.1.15
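
Because a production deploy takes a version argument typed by hand, a small pre-flight check can catch typos before anything is deployed. The following is a minimal sketch under stated assumptions: is_semver_tag and deploy_production_checked are hypothetical names that do not exist in the repository, and the check that the tag exists in the local clone assumes production deploys should always correspond to an existing release tag.

```shell
# Hypothetical pre-flight check; not part of deploy_production.sh.
# Succeeds only for versions of the form v1.1.15.
is_semver_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Hypothetical wrapper: refuse to deploy unless the argument looks like a
# release tag and (assumption) that tag exists in the local clone.
deploy_production_checked() {
  version="$1"
  if ! is_semver_tag "$version"; then
    echo "refusing to deploy: '$version' is not of the form v1.1.15" >&2
    return 1
  fi
  if ! git rev-parse -q --verify "refs/tags/$version" >/dev/null 2>&1; then
    echo "refusing to deploy: tag $version not found locally" >&2
    return 1
  fi
  ./deploy_production.sh "$version"
}
```

With these functions defined, deploy_production_checked v1.1.15 behaves like the plain command when the checks pass and returns a non-zero status otherwise.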

Dataflow pipelines

As noted above, deploys to staging from local machines and deploys to production automatically deploy calculation pipeline updates to Dataflow in the correct environment, staging or production respectively. In either case, this uses the current state of the calculation pipeline code and configuration in the repository.

If you need to deploy calculation pipelines specifically and nothing else, you can use the deploy_pipeline_to_template.sh script directly. This script relies on the pipeline configuration in the calculation_pipeline_templates.yaml file, also in the root of the repository. To parse this file, you must first source a separate script. Thus:

  1. source ./recidiviz/tools/deploy_pipeline_from_yaml.sh
  2. deploy_pipelines ./deploy_pipeline_to_template.sh recidiviz-staging recidiviz-staging-dataflow-templates ./calculation_pipeline_templates.yaml
    1. For production, replace the two staging arguments with their production equivalents.
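
The two steps above can be combined into one function that takes the environment-specific values as arguments. This is a sketch only: deploy_all_pipelines is a hypothetical name, and it assumes that sourcing deploy_pipeline_from_yaml.sh is what defines deploy_pipelines, as the steps above imply. Both arguments are passed through unchanged, so no production values are assumed here.

```shell
# Hypothetical wrapper around the two steps above; run it from the root of
# the repository. $1 and $2 are the second and third arguments of the
# deploy_pipelines command shown above.
deploy_all_pipelines() {
  project="$1"
  templates_arg="$2"
  if [ -z "$project" ] || [ -z "$templates_arg" ]; then
    echo "usage: deploy_all_pipelines <project> <templates-arg>" >&2
    return 2
  fi
  # Sourcing is assumed to define the deploy_pipelines function used below.
  . ./recidiviz/tools/deploy_pipeline_from_yaml.sh &&
    deploy_pipelines ./deploy_pipeline_to_template.sh \
      "$project" "$templates_arg" ./calculation_pipeline_templates.yaml
}
```

For staging, the call would be deploy_all_pipelines recidiviz-staging recidiviz-staging-dataflow-templates.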

Cloud functions

TODO: details

PDF Parsing service

TODO: details

Operations

When deploying a release, the following resources are available to monitor the deploy success:

  • Travis CI log output for automated staging deploys
  • Local log output (stdout and stderr) for manual staging deploys and production deploys
  • The GAE Versions view to see which versions are running, with how many instances, deployed by whom, with or without traffic allocation
    • For this and the views listed below, make sure the correct project is specified in the top-level dropdown
  • The GCP Logging view, filtered to "GAE Application, Default Service, All version_id, All logs" to see log events corresponding to both successful app startup and issues on the VM itself (these should be in stderr)
    • In the logging view, you can narrow the filter to specific versions
  • The "Task queues" and "Cron jobs" tabs in the left-hand menu of the GAE UI to double-check any expected updates to queue or cron configuration
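
The same log view can also be approximated from the command line. As a minimal sketch, the hypothetical helper gae_log_filter below builds a Cloud Logging filter for the default GAE service, optionally narrowed to one version. The version ID v1-1-15 in the usage comment is an assumption about how release tags map to GAE version IDs, which this page does not specify.

```shell
# Hypothetical helper: build a Cloud Logging filter for the default GAE
# service, optionally restricted to a single version ID.
gae_log_filter() {
  version_id="$1"
  filter='resource.type="gae_app" AND resource.labels.module_id="default"'
  if [ -n "$version_id" ]; then
    filter="$filter AND resource.labels.version_id=\"$version_id\""
  fi
  printf '%s\n' "$filter"
}

# Usage (requires an authenticated gcloud; the version ID is an assumption):
# gcloud logging read "$(gae_log_filter v1-1-15)" \
#   --project=recidiviz-staging --limit=20
```

As with the console views, make sure the --project value matches the environment you are inspecting.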

In the GAE Versions view, you can split traffic to a particular version, either to roll back to that version or to stop diverting traffic as an experiment. You can also stop or delete prior versions from the same view for cleanup purposes.
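
These operations also have gcloud equivalents, which can be useful when the console is unavailable. The sketch below prints the command rather than running it, so it can be reviewed first. The helper name route_all_traffic_cmd is hypothetical, and the version IDs shown (v1-1-15, v1-1-14) are assumptions about how versions are named in GAE.

```shell
# Hypothetical helper: print (rather than run) the gcloud command that
# sends all traffic for the default service to a single version.
route_all_traffic_cmd() {
  project="$1"
  version_id="$2"
  printf 'gcloud app services set-traffic default --splits=%s=1 --project=%s\n' \
    "$version_id" "$project"
}

route_all_traffic_cmd recidiviz-staging v1-1-15

# Related commands (version IDs are placeholders):
# gcloud app versions list --service=default --project=recidiviz-staging
# gcloud app versions stop v1-1-14 --service=default --project=recidiviz-staging
# gcloud app versions delete v1-1-14 --service=default --project=recidiviz-staging
```

Printing the command first leaves a deliberate manual step before traffic actually moves, which seems appropriate for a rollback.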