This repository has been archived by the owner on May 31, 2024. It is now read-only.

Syncing across repos

Melissa Braxton edited this page Feb 1, 2023 · 4 revisions

This product is part of a collection of similarly managed products. To simplify product management across their repositories, we've added automations that keep parts of the repos synchronized.

Issue and pull request templates

We use issue and pull request templates for consistency and for preserving important knowledge:

  • Issue templates help us ensure that each issue has the information a teammate needs to pick it up and work on it without having to track down the author for more context.
  • Pull request templates remind teammates of the necessary steps before accepting product changes, such as accessibility reviews and outreach approval.

Both of these templates capture knowledge that the people maintaining these products have learned over time.

We keep the issue and pull request templates in the various repositories synchronized so that people do not have to remember to manually make the changes in multiple repos. Instead, changing, adding, or deleting a template will cause new pull requests to be opened on all the other repositories automatically. Then people only need to approve and merge them rather than manually authoring the changes.

The synchronization is handled by a GitHub Actions workflow that runs on every push to a repo's default branch (usually main, but it depends on the repo). The workflow is located at .github/workflows/sync_templates.yml. It has three steps.

1. Get a temporary access token

To interact with multiple repos, we can't rely on the automatic token that GitHub Actions creates for each workflow run, which is scoped to the repo where the workflow runs, and we don't want to rely on a personal access token, which is tied to one person's account. Instead, we use a GitHub App to generate a short-lived token for us. We use a third-party action called github-app-token to take the App credentials and fetch an access token.

The github-app-token action needs to know the ID, installation ID, and private key of the GitHub App. Contact #admins-github on Slack to get those credentials.

2. Check out the repo

To understand what needs to be synchronized, we must check out the code. So... we do that! This step is super easy.

3. Synchronize changes to other repos

Finally, we use the repo-file-sync action to compare this repo with the others to see if our new changes need to be pushed outwards. This action relies on a synchronization configuration file to inform it about what to sync and how.

The sync config file is located at .github/sync.yml. Documentation describing the config file is available on the action repo page. The current configuration is a single "group" that lists all of the repos that should be synchronized along with the files to be synced. The sync config file itself is also synced across repos to help keep everything configured properly.

NOTE: This synchronization is one-way. Changes are pushed to other repos, not pulled in.

Labels

We also synchronize labels between our various product repos. This happens with a GitHub Actions workflow that runs nightly. The workflow is located at .github/workflows/sync-labels.yml. It uses the label-sync action. This synchronization is a pull rather than a push.

GitHub Actions has a workflow trigger called label that fires any time a label is created, edited, or deleted. Ideally our workflow would run on the label trigger instead of nightly. However, because the label-sync action pulls rather than pushes, that does not work: if we add a label to this repo and then run the workflow on that trigger, it will attempt to pull labels from the other repos, but the new label won't exist on them yet. There is an open issue on the label-sync action to allow pushing changes to other repos. If that is implemented, then we should switch to using the label trigger instead of the schedule trigger.
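
The timing problem can be seen in a toy model. The Python sketch below is not the label-sync action's code. It assumes, purely for illustration, that a "pull" adds every label found on the other repos to the local set, and that each repo runs its own copy of the workflow on the nightly schedule. The repo and label names and the `pull_labels` helper are hypothetical.

```python
def pull_labels(local: set[str], others: list[set[str]]) -> set[str]:
    """Toy pull (assumed semantics): local labels plus every label on the other repos."""
    return local.union(*others)


# Three hypothetical repos that start out with identical labels.
repos = {name: {"bug", "design"} for name in ("repo-one", "repo-two", "repo-three")}

# A teammate adds a label to repo-one, which would fire the label trigger there.
repos["repo-one"].add("needs-research")

# On the label trigger, only repo-one's workflow runs, and it pulls from the others.
others = [labels for name, labels in repos.items() if name != "repo-one"]
repos["repo-one"] = pull_labels(repos["repo-one"], others)
after_trigger = {name: set(labels) for name, labels in repos.items()}

# On the nightly schedule (assumed), every repo's workflow runs and pulls from the rest.
snapshot = {name: set(labels) for name, labels in repos.items()}
for name in repos:
    repos[name] = pull_labels(
        snapshot[name], [snapshot[other] for other in snapshot if other != name]
    )

print("needs-research" in after_trigger["repo-two"])  # False: the trigger run pushed nothing outward
print("needs-research" in repos["repo-two"])          # True: repo-two pulled it during its own run
```

Under these assumptions, a pull that runs only in the repo where the label was created never carries the new label outward, which is why the schedule trigger is used for now.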

Wiki pages

Some pages of the wikis in this collection of repos are also synced. These are the pages that should be consistent across the project:

  • our guiding principles
  • our release practices
  • our approach to testing
  • our approach to synchronization
  • the members of our team

The repo-file-sync-action relies on the GitHub API to create git trees, but wiki repos are not accessible via the API. So for wikis, we do the synchronization with a small amount of custom code in a GitHub Actions workflow.

Our workflow is configured to run with a matrix strategy. Our matrix consists of all the repos that should participate in syncing. With a matrix strategy, the workflow's job runs multiple times, once for each combination of the values in the matrix. Our matrix is one-dimensional, so the total number of job runs is equal to the number of repos we want to sync.
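
To make the run count concrete, here is a small Python sketch of how a matrix expands into job runs. It models the idea only and is not how GitHub Actions is implemented. The `expand_matrix` helper is hypothetical, and the repo names are placeholders.

```python
from itertools import product


def expand_matrix(matrix: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return one job configuration per combination of matrix values."""
    keys = list(matrix)
    return [
        dict(zip(keys, values))
        for values in product(*(matrix[key] for key in keys))
    ]


# A one-dimensional matrix with a single "target" key, like ours.
jobs = expand_matrix({"target": ["18f/repo-one", "18f/repo-two"]})
for job in jobs:
    print(job["target"])
```

With one key and two values there are two job runs. Adding a second key with three values would produce six, which is why a one-dimensional matrix keeps the run count equal to the number of repos.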

The workflow is configured to be triggered by changes to wiki pages using the gollum event. Each workflow run follows this sequence of steps:

1. Get a temporary access token

To interact with multiple repos, we can't rely on the automatic token that GitHub Actions creates for each workflow run, which is scoped to the repo where the workflow runs, and we don't want to rely on a personal access token, which is tied to one person's account. Instead, we use a GitHub App to generate a short-lived token for us. We use a third-party action called github-app-token to take the App credentials and fetch an access token.

The github-app-token action needs to know the ID, installation ID, and private key of the GitHub App. Contact #admins-github on Slack to get those credentials.

2. Check out the source wiki that was just changed

Our workflow is triggered by a wiki changing, so we'll start by checking out the changed wiki to use as the source of truth. We determine the git URL of the wiki by adding .wiki to the repo name provided by GitHub when the workflow runs.
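
As a rough illustration of that naming rule, the Python sketch below builds a wiki's HTTPS clone URL from an "owner/name" repo string. GitHub serves a repo's wiki as a separate git repository whose name is the repo name plus .wiki. The `wiki_clone_url` helper is hypothetical, and our actual workflow may pass the name to a checkout step rather than build a URL.

```python
def wiki_clone_url(repository: str) -> str:
    """Build the HTTPS clone URL for a repo's wiki from its "owner/name" string."""
    owner, _, name = repository.partition("/")
    if not owner or not name:
        raise ValueError(f"expected 'owner/name', got {repository!r}")
    return f"https://github.com/{owner}/{name}.wiki.git"


print(wiki_clone_url("18f/repo-one"))
# prints https://github.com/18f/repo-one.wiki.git
```

Inside a workflow run, the "owner/name" string is available from the GITHUB_REPOSITORY environment variable.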

3. Check out the target wiki to update

The target wiki is determined by the target value in the matrix. Each workflow job run within a single workflow run will have a unique target value. For example, if the matrix target were [18f/repo-one, 18f/repo-two], there would be two workflow job runs. The first's target would be 18f/repo-one and the second's target would be 18f/repo-two.

The target wiki is the one that will be updated from the source-of-truth wiki from step 2.

4. Copy the synchronized files from the source to the target

The list of files that should be synchronized across repos is defined in this step. We use rsync to copy those files from the source wiki to the target wiki.

We use rsync because it is better at preventing file corruption than the cp command. rsync uses checksums to ensure that the copied data matches the source data. It can also do partial file transfers, which is often faster, but given the small number of files and their nature, that is likely negligible for us.
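
The idea of verifying a copy with checksums can be sketched in a few lines of Python. This is a stand-in to show the principle, not rsync's algorithm and not what our workflow runs. The `copy_verified` helper and the file name are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def copy_verified(names: list[str], source: Path, target: Path) -> list[str]:
    """Copy the named files into target and return the names that were updated."""
    updated = []
    for name in names:
        src, dst = source / name, target / name
        if dst.exists() and sha256_of(dst) == sha256_of(src):
            continue  # target already matches the source, so skip it
        shutil.copyfile(src, dst)
        if sha256_of(dst) != sha256_of(src):
            raise OSError(f"checksum mismatch after copying {name}")
        updated.append(name)
    return updated


# Demo in throwaway directories standing in for the two checked-out wikis.
with tempfile.TemporaryDirectory() as tmp:
    source, target = Path(tmp, "source"), Path(tmp, "target")
    source.mkdir()
    target.mkdir()
    (source / "Release-practices.md").write_text("new text\n")
    (target / "Release-practices.md").write_text("old text\n")
    first = copy_verified(["Release-practices.md"], source, target)
    second = copy_verified(["Release-practices.md"], source, target)

print(first)   # ['Release-practices.md']
print(second)  # []
```

The second call copies nothing because the target already matches the source, which is the same property that lets the workflow skip pushing when nothing changed.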

5. Push any changes to the target wiki

After the files are copied, we use git to determine if any files have changed. If they have, then we commit and push them directly to the target wiki. Because there is no pull request interface for wikis, the changes are made directly to the wiki's default branch.
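
The decision in this step can be sketched in Python as follows. This is a minimal sketch, not the workflow's actual code. It assumes git is on the PATH and that the checked-out wiki already has its push credentials and committer identity configured. The `push_if_changed` and `run_git` names and the commit message are hypothetical.

```python
import subprocess
from typing import Callable

GitRunner = Callable[[list[str], str], str]


def run_git(args: list[str], cwd: str) -> str:
    """Run a git command in cwd and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


def push_if_changed(wiki_dir: str, message: str, git: GitRunner = run_git) -> bool:
    """Commit and push the wiki if anything changed. Return True if a push happened."""
    # `git status --porcelain` prints one line per changed or untracked file
    # and prints nothing when the working tree is clean.
    if not git(["status", "--porcelain"], wiki_dir).strip():
        return False
    git(["add", "--all"], wiki_dir)
    git(["commit", "-m", message], wiki_dir)
    git(["push"], wiki_dir)  # goes straight to the wiki's default branch
    return True
```

A call such as push_if_changed("target-wiki", "Sync shared wiki pages") would run once per target wiki, after the copy step. The git parameter exists only so the logic can be exercised without a real repository.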

NOTE: This synchronization is one-way. Changes are pushed to other repos, not pulled in.