Sets up A) a general messaging topic to which additional event listeners can be hooked up, and B) a small analytics service that feeds all events into a database. The database can then be plugged into a dashboard frontend like Superset or Metadash.
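For producers, feeding an event into the pipeline boils down to a single SNS `publish` call. A minimal sketch, assuming a JSON envelope of the sketch's own invention (the real event shape is up to your listeners); the client is injected so the helper can be exercised without AWS credentials:

```python
import json


def build_message(event_type, payload):
    # Envelope shape is an assumption, not taken from this repo;
    # adapt it to whatever your event listeners expect.
    return json.dumps({"event_type": event_type, "payload": payload})


def publish_event(sns_client, topic_arn, event_type, payload):
    """Publish one event to the shared SNS topic.

    `sns_client` should be a boto3 SNS client created with the
    messaging IAM user's credentials, e.g. boto3.client("sns").
    """
    return sns_client.publish(
        TopicArn=topic_arn,
        Message=build_message(event_type, payload),
    )
```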
Features include
- Terraform-managed Lambdas (inspired by this blog post) + a helper script for the most common management tasks
- Multi-environment deployment with separate Terraform states
- Local tests using moto to mock AWS endpoints
- Keeping development dependencies out of deployed Lambda zips with pipenv
- Overriding dependencies with Lambda-compatible precompiled packages on deployment (e.g. `precompiled/psycopg2/`)
The infra consists of the following Terraform modules:

- `shared` sets up a Postgres micro-instance and a security group allowing external access. As the name suggests, all stages share one database in order to stay within the AWS free tier's limit of one micro instance; stage separation instead happens at the schema level.
- `messaging` sets up an SNS topic for events and an IAM user with publishing permissions. Event producers should use this IAM user's API keys.
- `analytics_db` sets up stage-specific schemas and roles in the shared database.
- `analytics_queue` sets up an SQS queue subscribed to the events SNS topic. A scheduled CloudWatch event triggers a consumer Lambda function every 5 minutes. The consumer pulls events from the SQS queue and fans them out to a worker Lambda, which in turn feeds the events into the shared database.
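The consumer's fan-out step might look roughly like this — a sketch, not the repo's actual code: the SNS-in-SQS envelope, queue URL, and worker name are assumptions, and the boto3 clients are passed in so the logic can be tested locally:

```python
import json


def fan_out(sqs_client, lambda_client, queue_url, worker_name, batch_size=10):
    """Pull a batch of events from the SQS queue and forward them to the
    worker Lambda asynchronously. Illustrative sketch only."""
    resp = sqs_client.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=batch_size
    )
    events = []
    for msg in resp.get("Messages", []):
        body = json.loads(msg["Body"])
        # SNS-to-SQS delivery wraps the original payload in a "Message" field.
        events.append(json.loads(body["Message"]))
        sqs_client.delete_message(
            QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"]
        )
    if events:
        lambda_client.invoke(
            FunctionName=worker_name,
            InvocationType="Event",  # async invoke; don't wait for the worker
            Payload=json.dumps(events).encode(),
        )
    return events
```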
Requirements:

- AWS Command Line Interface
- invoke
- pipenv >= 8.3.0
- terraform
- awslogs (optional)
- An S3 bucket for storing Terraform remote state
First set up the shared environment:

1. Navigate to the shared environment directory `infrastructure/shared`.
2. Create a `terraform.tfvars` secret variables file. See `infrastructure/shared/terraform.tfvars.sample` for an example. These files should naturally be kept outside version control.
3. Run `terraform init` to set up a Terraform working directory. You'll be prompted for the name of your remote state bucket. Alternatively, you can define the bucket with a `-backend-config='bucket=[BUCKET NAME]'` argument.
4. Run `terraform apply` to build the infra.
Now, with the shared infra set up, you can provision individual stages. For example, to set up `dev`:

1. Build the Lambda functions: `inv build`.
2. Navigate to the environment directory: e.g. `cd infrastructure/dev`.
3. Repeat steps 2-4 above to set up the stage-specific resources.
4. Initialize the database: `inv init-db --env dev`.
And we're set! Replace `dev` with `staging`, `prod`, etc. to set up additional stages.
Various management tasks are defined in `tasks.py`. The default environment (`--env`) and AWS profile can be configured in `invoke.yaml`.
Run `inv --list` to see a summary of all available tasks. The most important tasks are as follows:

- `inv build --func [FUNCTION]`: Build a function. Builds all functions if `--func` is not specified.
- `inv test --func [FUNCTION]`: Run a function's tests. Tests all functions if `--func` is not specified.
- `inv invoke [FUNCTION] --env [ENV] --payload [PAYLOAD]`: Invoke a deployed function. Example: `inv invoke analytics_worker --env staging --payload '[{"event_id": 12}]'`
- `inv update [FUNCTION] --env [ENV]`: Quickly update function code without rebuilding dependencies.
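Under the hood, a quick update like this presumably re-zips the function source and pushes it with the AWS CLI's `update-function-code`. A hedged sketch of how such a command could be assembled — the `[ENV]_[FUNCTION]` naming scheme and the `build/` zip path are assumptions, not this repo's actual conventions:

```python
def update_command(function, env, profile=None):
    """Build the AWS CLI call that replaces a deployed function's code
    without touching its dependencies or Terraform state."""
    parts = [
        "aws", "lambda", "update-function-code",
        "--function-name", f"{env}_{function}",
        "--zip-file", f"fileb://build/{function}.zip",
    ]
    if profile:
        parts += ["--profile", profile]
    return " ".join(parts)
```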
Due to a bug in the Terraform Postgres provider, changing db password variables doesn't actually result in a password update. As a workaround, you can manually `DROP ROLE ...` via `psql` and re-apply Terraform.
TODO:

- Local invocation
- Set up alembic or similar to manage DB migrations
- An .ignore file to configure files kept out of Lambda zips