[CEP] Environment-specific branching configuration (eg staging.yml) #27990

esoergel · 2020-06-29T20:21:10Z

Abstract
While most environments usually run master, the staging environment runs a composite branch defined by staging.yml. This CEP proposes to extend that model for all environments. Each environment will have its own branches.yml file which will define what code should be deployed.

Motivation
Production environments usually just deploy whatever the latest master commit is. There are some notable exceptions to this rule:

Hotfix deploys (for urgent, simple bugfixes, or to minimize risks of weekend deploys)
Code freezes (for auditing purposes or for increased stability during critical periods)
Staging environments (an ICDS staging environment is in the works)

Communicating these exceptions is informal, and coordination is tricky - we've spun up docs in the past detailing what should be deployed like commit abc123 plus PR 456 plus PR 458. The aim of this CEP is to formalize that process and make use of existing tooling in managing this.

Specification

Make rebuildstaging a standalone tool (name TBD, something like combine-branches?)
a. Have this tool accept an environment-specific config file and operate on a repository specified in an environment variable.
Move staging.yml to the staging environment config directory and rename branches.yml
Modify rebuildstaging to not build anything if name is master
Modify the deploy scripts to deploy the branch specified in this file.
a. This should fall back to master if the env doesn't have a branches.yml
a. _confirm_deploying_same_code() should be updated to reflect these changes.

After the above changes, the state of the world would look like:

Staging would be much the same, but with staging.yml in a different location and a slightly different command to rebuild.
Access to staging configurations can be granted by environment-specific repositories. For example, ICDS staging could be maintained apart from commcare-hq and commcare-cloud, meaning contractors could be granted access without obtaining write access to the HQ repository. [this would require specifying an alternate remote repository as well]
In normal operation, production environments would be set with trunk and name to master, and the rest of the file blank. This would result in a normal deploy of the latest master.
For hotfixes and code freezes, trunk would be set to the last full deploy commit, name would be a hotfix branch name, and code to be included would be referenced as additional branches.
- Running combine-branches (or whatever) will rebuild the hotfix branch by adding the specified changes on top of the last deploy commit.
- The deploy scripts already alert you if there's no code change. This means that if an environment is on a branch other than master, it will have to be manually updated to pick up changes, but the deploy script will warn you if you forget.

Impact on users
n/a

Impact on hosting

Backwards compatibility
👍

Release Timeline

Open questions and issues
As frequently evidenced on staging, dynamically combining branches can introduce conflicts, including some that aren't identified by git. Should we be concerned about an increased reliance on this model? As envisioned, tests won't be run on the resulting branch before it's deployed. How can we minimize the risks associated with this? Would mandating that branches be merged in to master before being added to a branches.yml help?

The text was updated successfully, but these errors were encountered:

snopoke · 2020-06-30T09:29:05Z

Access to staging configurations can be granted by environment-specific repositories. For example, ICDS staging could be maintained apart from commcare-hq and commcare-cloud, meaning contractors could be granted access without obtaining write access to the HQ repository.

Can you elaborate on how this would work. Won't you still need access to push the updated branch to Github before deploy?

millerdev · 2020-06-30T18:26:17Z

For hotfixes and code freezes, trunk would be set to the last full deploy commit, name would be a hotfix branch name, and code to be included would be referenced as additional branches.

Does this mean branches.yml would need to be inspected (and possibly updated) before each deploy to ensure it is in an acceptable state? For example, the contents of the file would be changed for a hotfix deploy, and then those changes would need to be reverted later before the next regular deploy?

I think our staging deployment model is sufficiently complicated that it is not ideal for production environments. There are a lot of things that can go wrong. As you point out, conflicts and untested code combinations are an issue. It is not usually a big deal if staging goes down for a short time or has buggy areas (as long as those areas are not being tested).

esoergel · 2020-06-30T22:14:21Z

@snopoke

Won't you still need access to push the updated branch to Github before deploy?

Doh, that was an oversight. I suppose part of the branching config could be to define a remote repo to use for the deploy - that would make a lot of sense, actually. I think I'll consider that out of scope for this change though and focus on the Dimagi internal use-cases.

@millerdev

Does this mean branches.yml would need to be inspected (and possibly updated) before each deploy to ensure it is in an acceptable state?

Sort of. It should be safe to assume that you can just run deploy like normal, but if the last deploy were a hotfix, you wouldn't be able to deploy unless you either updated the hotfix branch (likely via combine-branches) or modified branches.yml. This is a positive, actually, since it's a barrier to unknowningly violating a code freeze. I can't think of a scenario where forgetting about branches.yml would get you in to trouble, but let me know if I'm overlooking something.

I think our staging deployment model is sufficiently complicated that it is not ideal for production environments.

I agree it would be dangerous to do more frequent unorthodox deploys, especially if cherry-picking commits is involved. However we already do this on occasion in an ad-hoc fashion, and it seems safer to be rigorous about it when it is necessary. Would you agree with that? What do you think would be the safest way to manage code freezes and hot fixes?

millerdev · 2020-07-01T19:47:15Z

Do I have this right? In normal flow the production branches.yml file stays in the same (very simple) state, and deploys can happen without changing it. However, when there is a hotfix deploy then the file gets modified in a way that (a) makes it possible to safely do the hotfix deploy, and (b) will not allow a normal production deploy to happen again without special action to move back from hotfix mode into normal production flow.

If that is all correct then it seems reasonable given that it only imposes extra steps around the hotfix workflow, and that seems like a good thing.

Except for hotfixes (where it happens anyway), I'm still ambivalent toward using a combine-branches pattern on any production environment because it means that environment will likely be running untested code combinations.

esoergel · 2020-07-01T20:56:44Z

@millerdev yup, that's all correct.

Except for hotfixes (where it happens anyway), I'm still ambivalent toward using a combine-branches pattern on any production environment because it means that environment will likely be running untested code combinations.

Totally agree. My intent (I'll see if I can clarify wording in the original post) is that the "normal flow" case means that we just deploy master. combine-branches would only be used for hotfixes and freezes, and only run manually, and it wouldn't output to master (we may need an additional check to block that).

millerdev · 2020-07-01T21:03:33Z

+1

snopoke · 2020-07-29T14:17:47Z

@esoergel currently the tool is executed in the checked out repo that is the source for the branch. Moving this to commcare-cloud would mean the script would need to either checkout the repo first or else be given a local path to the repo.

Do you have a sense of which of those 2 would be better?

esoergel · 2020-07-29T16:01:49Z

@snopoke I was imagining the latter. I wrote in the spec that the local path could be provided via an environment variable, but I'd be open to alternate approaches.

Making a new clone of the repo does have some appeal though - it'd be nice to avoid messing with a local development copy. I think HQ is too big to clone fresh each time, but I could see a case made for this tool to store a local copy for its own internal use. What do you think?

snopoke · 2020-07-30T07:20:04Z

+1 to supplying the path to the repos - I think if you want to use separate clones for this you can always check out a copy yourself. For small repos it might be nice to have the tool clone it but for HQ I don't think it makes sense.

I'm likely to start doing some work related to this since ICDS is soon going to have 2 repos to manage so it makes sense to move this tool of the HQ repo. For now I'm likely just going to put it into commcare-cloud.

esoergel · 2020-07-30T20:27:49Z

Awesome, that'll be exciting to see.

if you want to use separate clones for this you can always check out a copy yourself.

Very good point

snopoke · 2020-08-03T11:56:58Z

initial work: dimagi/commcare-cloud#4050

snopoke · 2020-09-03T12:27:04Z

Created external tool to build the branches: https://github.com/dimagi/git-branch-builder

snopoke · 2020-09-03T15:11:36Z

migrated HQ scripts to use external tool: #28473

esoergel added the CEP CommCare Enhancement Proposal label Jun 29, 2020

snopoke mentioned this issue Jul 1, 2020

Updates in ICDS staging deploy process #27998

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CEP] Environment-specific branching configuration (eg staging.yml) #27990

[CEP] Environment-specific branching configuration (eg staging.yml) #27990

esoergel commented Jun 29, 2020 •

edited

snopoke commented Jun 30, 2020

millerdev commented Jun 30, 2020

esoergel commented Jun 30, 2020

millerdev commented Jul 1, 2020

esoergel commented Jul 1, 2020

millerdev commented Jul 1, 2020

snopoke commented Jul 29, 2020

esoergel commented Jul 29, 2020

snopoke commented Jul 30, 2020

esoergel commented Jul 30, 2020

snopoke commented Aug 3, 2020

snopoke commented Sep 3, 2020

snopoke commented Sep 3, 2020

[CEP] Environment-specific branching configuration (eg staging.yml) #27990

[CEP] Environment-specific branching configuration (eg staging.yml) #27990

Comments

esoergel commented Jun 29, 2020 • edited

snopoke commented Jun 30, 2020

millerdev commented Jun 30, 2020

esoergel commented Jun 30, 2020

millerdev commented Jul 1, 2020

esoergel commented Jul 1, 2020

millerdev commented Jul 1, 2020

snopoke commented Jul 29, 2020

esoergel commented Jul 29, 2020

snopoke commented Jul 30, 2020

esoergel commented Jul 30, 2020

snopoke commented Aug 3, 2020

snopoke commented Sep 3, 2020

snopoke commented Sep 3, 2020

esoergel commented Jun 29, 2020 •

edited