-
Create a new Google Compute Engine instance from the sdow-web-server instance template, which is configured with the following specs:
  - Name: sdow-web-server-1
  - Zone: us-central1-c
  - Machine Type: f1-micro (1 vCPU, 0.6 GB RAM)
  - Boot disk: 32 GB SSD, Debian GNU/Linux 10 (buster)
  - Notes: Click "Set access for each API" and use default values for all APIs except set Storage to "Read Write".
-
Set the default region and zone for the gcloud CLI:
$ gcloud config set compute/region us-central1
$ gcloud config set compute/zone us-central1-c
-
SSH into the machine (replace # with the instance number, e.g. 1):
$ gcloud compute ssh sdow-web-server-# --project=sdow-prod
-
Install required operating system dependencies to run the Flask app:
$ sudo apt-get -q update
$ sudo apt-get -yq install git pigz sqlite3 python-pip
$ sudo pip install --upgrade pip setuptools virtualenv
# OR for Python 3
#$ sudo apt-get -q update
#$ sudo apt-get -yq install git pigz sqlite3 python3-pip
#$ sudo pip3 install --upgrade pip setuptools virtualenv
-
Clone this repository via HTTPS and navigate into it:
$ git clone https://github.com/jwngr/sdow.git
$ cd sdow/
-
Create and activate a new virtualenv environment:
$ virtualenv -p python2 env  # OR virtualenv -p python3 env
$ source env/bin/activate
-
Install the required Python libraries:
$ pip install -r requirements.txt
-
Copy the latest compressed SQLite file from the sdow-prod GCS bucket:
$ gsutil -u sdow-prod cp gs://sdow-prod/dumps/<YYYYMMDD>/sdow.sqlite.gz sdow/
-
Decompress the SQLite file:
$ pigz -d sdow/sdow.sqlite.gz
-
Create the searches.sqlite file:
$ sqlite3 sdow/searches.sqlite ".read sql/createSearchesTable.sql"
Note: Alternatively, copy a backed-up version of searches.sqlite:
$ gsutil -u sdow-prod cp gs://sdow-prod/backups/<YYYYMMDD>/searches.sql.gz sdow/searches.sql.gz
$ pigz -d sdow/searches.sql.gz
$ sqlite3 sdow/searches.sqlite ".read sdow/searches.sql"
$ rm sdow/searches.sql
-
Install required operating system dependencies to generate an SSL certificate (this and the following instructions are based on these blog posts):
$ sudo apt-get -q update
$ sudo apt-get -yq install nginx certbot python-certbot-nginx
-
Add this location block inside the server block in /etc/nginx/sites-available/default:
location ~ /.well-known {
  allow all;
}
-
Start NGINX:
$ sudo systemctl restart nginx
-
Ensure the VM has been assigned the proper static IP address (sdow-web-server-static-ip) by editing the instance in the GCP console.
-
Create an SSL certificate using Let's Encrypt's certbot:
$ sudo certbot certonly -a webroot --webroot-path=/var/www/html -d api.sixdegreesofwikipedia.com --email wenger.jacob@gmail.com
-
Ensure auto-renewal of the SSL certificate is configured properly:
$ sudo certbot renew --dry-run
-
Run crontab -e and add the following cron jobs to that file to auto-renew the SSL certificate, regularly restart the web server (to ensure it stays responsive), and back up the searches database weekly:
# Renew the cert daily.
0 4 * * * sudo /usr/bin/certbot renew --noninteractive --renew-hook "sudo /bin/systemctl reload nginx"
# Restart the server every ten minutes.
*/10 * * * * /home/jwngr/sdow/env/bin/supervisorctl -c /home/jwngr/sdow/config/supervisord.conf restart gunicorn
# Back up the searches database weekly.
0 6 * * 0 /home/jwngr/sdow/scripts/backupSearchesDatabase.sh
Note: Let's Encrypt debug logs can be found at /var/log/letsencrypt/letsencrypt.log.
Note: Supervisor debug logs can be found at /tmp/supervisord.log.
-
Replace the ExecStart line in /lib/systemd/system/certbot.service with the following to ensure NGINX reloads every time a new certificate is generated:
ExecStart=/usr/bin/certbot -q renew --noninteractive --renew-hook "sudo /bin/systemctl reload nginx"
-
Run the following commands to restart certbot and ensure the new timer is enabled:
$ sudo systemctl daemon-reload
$ sudo systemctl restart certbot.service
$ sudo systemctl restart certbot.timer
-
Install a mail service in order to read logs from cron jobs:
$ sudo apt-get -yq install postfix  # Choose "Local only" and use the default email address.
Note: Cron job logs will be written to /var/mail/jwngr.
-
Generate a strong Diffie-Hellman group to further increase security (note that this can take a couple of minutes):
$ sudo openssl dhparam -out /etc/ssl/certs/dhparam.pem 2048
-
Copy over the NGINX configuration, making sure to back up the original configuration:
$ sudo cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf.backup
$ sudo cp ./config/nginx.conf /etc/nginx/nginx.conf
-
Restart nginx:
$ sudo systemctl restart nginx
-
Install the Stackdriver monitoring agent:
$ curl -sSO https://dl.google.com/cloudagents/add-monitoring-agent-repo.sh
$ sudo bash add-monitoring-agent-repo.sh
$ sudo apt-get update
$ rm add-monitoring-agent-repo.sh
$ sudo apt-get -yq install stackdriver-agent
$ sudo service stackdriver-agent start
-
Activate the virtualenv environment:
$ cd sdow/
$ source env/bin/activate
-
Start the Flask web server via Supervisor, which runs Gunicorn:
$ cd config/
$ supervisord
-
Use supervisorctl to manage the running web server:
$ supervisorctl status            # Get status of running processes
$ supervisorctl stop gunicorn     # Stop web server
$ supervisorctl start gunicorn    # Start web server
$ supervisorctl restart gunicorn  # Restart web server
Note: supervisord and supervisorctl must either be run from the config/ directory or be given the configuration file via the -c argument; otherwise they return an obscure "http://localhost:9001 refused connection" error message.
Note: Log output from supervisord is written to /tmp/supervisord.log and log output from gunicorn is written to /tmp/gunicorn-stdout---supervisor-<HASH>.log. Logs are also written to Stackdriver Logging.
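The "refused connection" error described in the note above can be avoided by always passing the configuration file explicitly. The following is a minimal sketch, not part of the repository: the function name sdow_supervisorctl and the SDOW_ROOT and SDOW_DRY_RUN variables are hypothetical, and the default path assumes the repository is checked out at /home/jwngr/sdow, as in the cron jobs above.

```shell
# Hypothetical wrapper: run supervisorctl with an explicit config file so it
# works from any directory. SDOW_ROOT is an assumed variable that defaults to
# the checkout path used in the cron jobs.
SDOW_ROOT="${SDOW_ROOT:-/home/jwngr/sdow}"

sdow_supervisorctl() {
  # With SDOW_DRY_RUN=1 (assumed variable), print the command instead of
  # running it.
  if [ "${SDOW_DRY_RUN:-0}" = "1" ]; then
    echo "$SDOW_ROOT/env/bin/supervisorctl -c $SDOW_ROOT/config/supervisord.conf $*"
  else
    "$SDOW_ROOT/env/bin/supervisorctl" -c "$SDOW_ROOT/config/supervisord.conf" "$@"
  fi
}
```

With this function defined in the shell, sdow_supervisorctl status or sdow_supervisorctl restart gunicorn should behave the same regardless of the current directory.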
To update the web server to a more recent sdow.sqlite file with minimal downtime, run the following commands after SSHing into the web server:
$ cd sdow/
$ source env/bin/activate
$ gsutil -u sdow-prod cp gs://sdow-prod/dumps/<YYYYMMDD>/sdow.sqlite.gz sdow/sdow_new.sqlite.gz
$ pigz -d sdow/sdow_new.sqlite.gz # This takes ~5 minutes and causes search to be non-responsive.
$ mv sdow/sdow_new.sqlite sdow/sdow.sqlite
$ cd config/
$ supervisorctl restart gunicorn
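The commands above could be collected into a few shell functions that validate the dump date before anything is downloaded. This is only a sketch under stated assumptions: the function names is_dump_date, dump_url, and update_database are hypothetical, the bucket layout is taken from the gsutil command above, and update_database is assumed to be called from the repository root with the virtualenv activated.

```shell
# Hypothetical helpers; none of these names exist in the repository.

is_dump_date() {
  # Accept exactly eight digits (YYYYMMDD); no calendar validation is done.
  case "$1" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

dump_url() {
  # Build the GCS path for a given dump date, following the layout used above.
  is_dump_date "$1" || { echo "expected YYYYMMDD, got '$1'" >&2; return 1; }
  echo "gs://sdow-prod/dumps/$1/sdow.sqlite.gz"
}

update_database() {
  # Each step runs only if the previous one succeeded, so Gunicorn is not
  # restarted unless the new database file is in place.
  url="$(dump_url "$1")" || return 1
  gsutil -u sdow-prod cp "$url" sdow/sdow_new.sqlite.gz &&
    pigz -d sdow/sdow_new.sqlite.gz &&
    mv sdow/sdow_new.sqlite sdow/sdow.sqlite &&
    (cd config/ && supervisorctl restart gunicorn)
}
```

Calling update_database with the date of the desired dump then performs the same download, decompress, swap, and restart sequence, and refuses to start if the argument is not an eight-digit date.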
To update the Python server code which powers the SDOW backend, run the following commands after SSHing into the web server:
$ cd sdow/
$ source env/bin/activate
$ git pull
$ pip install -r requirements.txt
$ cd config/
$ supervisorctl restart gunicorn
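As a rough illustration, the code update steps could be chained so that Gunicorn is only restarted when the pull and the dependency install both succeed. The names run_step, update_server_code, and SDOW_DRY_RUN below are hypothetical, and the sketch assumes it is run from the repository root with the virtualenv activated. It passes the configuration file via -c instead of changing into config/.

```shell
# Hypothetical helpers; not part of the repository.

run_step() {
  # Print the step, then run it unless SDOW_DRY_RUN=1 (assumed variable).
  echo "+ $*"
  [ "${SDOW_DRY_RUN:-0}" = "1" ] || "$@"
}

update_server_code() {
  # Stop at the first failing step so a broken pull or install does not
  # trigger a restart.
  run_step git pull &&
    run_step pip install -r requirements.txt &&
    run_step supervisorctl -c config/supervisord.conf restart gunicorn
}
```

Setting SDOW_DRY_RUN=1 before calling update_server_code prints the three commands without executing them, which can be a convenient way to check the sequence first.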