Remove force failover. Not sure what to do about verifying mirror? #4108

Draft: wants to merge 1 commit into base: main

7 changes: 3 additions & 4 deletions source/manual/2nd-line-drills.html.md
@@ -34,12 +34,11 @@ You can run this in any environment, as you're only running `plan` - not `apply`

On Integration or Staging, follow the [Restore an RDS instance via the AWS CLI](/manual/howto-backup-and-restore-in-aws-rds.html#restore-an-rds-instance-via-the-aws-cli) instructions for an app of your choice.

## Force failover to GOV.UK mirror and Emergency publishing using the GOV.UK mirror
## Emergency publishing using the GOV.UK mirror

1. Warn in `#govuk-2ndline-tech` that you're about to do this, as it will lead to a spike in alerts and will also break continuous deployment for a while (due to Smokey failures).
1. Follow the [Forcing failover to the GOV.UK mirrors](/manual/fall-back-to-mirror.html#forcing-failover-to-the-gov-uk-mirrors) instructions on Integration or Staging.
1. To verify that it worked, visit a page at random and [purge the page from cache](/manual/purge-cache.html). Reload the page, to see the 'mirrored' version of the content. NB: you wouldn't do this in a real incident, as we'd want to serve Fastly's cached version for as long as possible.
1. Undo your changes to have Nginx handling requests again.
1. Follow the [Emergency publishing content using the GOV.UK mirror](/manual/fall-back-to-mirror.html#emergency-publishing-content-using-the-gov-uk-mirror) instructions on Integration or Staging.
1. To verify that it worked, ???

## Drill logging into accounts

27 changes: 0 additions & 27 deletions source/manual/fall-back-to-mirror.html.md
@@ -71,33 +71,6 @@ If the `govuk_seed_crawler` cronjob fails to run:
- The [RabbitMQ dashboard](https://grafana.blue.production.govuk.digital/dashboard/file/rabbitmq.json?refresh=10s&orgId=1) will show fewer jobs (or no jobs at all) being published to the `govuk_crawler_queue` queue.
- [Monitoring for the `cache_public_web_acl` ACL](https://us-east-1.console.aws.amazon.com/wafv2/homev2/web-acl/cache_public_web_acl/d9033e40-69e8-4bbc-a61a-cd3c50254d04/overview?region=eu-west-1) on AWS WAF will show a reduced number of requests to the cache machines (`govuk-infra-cache-requests AllowedRequests`).

## Forcing failover to the GOV.UK mirrors

If Origin is unavailable, Fastly will automatically retry every request against the mirrors.

To avoid Fastly traffic hitting Origin when Origin is down (potentially making the problem worse), we can [fall back to AWS CloudFront](/manual/fall-back-to-aws-cloudfront.html), which serves all content using the GOV.UK mirrors.

Alternatively, we can stop [Nginx](https://www.nginx.com/) on the cache machines, which will prevent requests hitting GOV.UK applications. Fastly will automatically retry these failed requests against the mirror.

SSH into each cache machine (you can increment box number after the colon to hit each one in turn):

```bash
$ gds govuk connect -e production ssh cache:1
```

Stop Nginx to force use of mirrors:

```bash
$ govuk_puppet --test --disable "fail_to_mirror task (by $USER)"
$ sudo service nginx stop
```

When required you can re-enable puppet, which will restart Nginx:

```bash
$ govuk_puppet --test --enable
```

## Emergency publishing content using the GOV.UK mirror

The escalation on-call contact will tell you if you need to make changes to GOV.UK while Origin is unavailable. To do this, you must change content on the GOV.UK mirrors. Because the mirror is static HTML, it's hard to make broad changes to the site, like putting a banner on every page.