UMCCR Data Portal Backend API

Cloud native serverless backend API for UMCCR Data Portal Client.

OpenAPI documentation available here. See User Guide for API usage.

Local Development

TL;DR

Policy: We aim to keep up with AWS Lambda Runtimes at every Portal major release cycle. You may use a higher version locally, but we encourage you to stay within the upper bound of AWS Lambda Runtimes, since that is where the code will run. See buildspec.yml and serverless.yml for the runtime version requirement.
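
As a rough illustration of the policy above, the following sketch compares the local interpreter against an upper bound. The bound (3, 11) and the helper name within_runtime_bound are assumptions made for illustration only; the authoritative runtime version is whatever buildspec.yml and serverless.yml declare.

```python
import sys

# Assumed upper bound, for illustration only; check serverless.yml for the real runtime.
ASSUMED_LAMBDA_RUNTIME = (3, 11)


def within_runtime_bound(local=None, bound=ASSUMED_LAMBDA_RUNTIME):
    """Return True if the local (major, minor) version does not exceed the bound."""
    if local is None:
        local = sys.version_info[:2]
    return tuple(local[:2]) <= tuple(bound)
```

Calling within_runtime_bound() with no arguments checks the interpreter that is currently running.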

Required:
- Python
- Node.js with Yarn
- Docker
- Active AWS SSO login session (e.g. aws sso login --profile dev && export AWS_PROFILE=dev)
Create a virtual environment using either the built-in python -m venv venv or conda.

docker --version
Docker version 24.0.7, build afdd53b4e3

python -V
Python 3.11.5

node -v
v18.18.0

npm i -g yarn
yarn -v
3.5.1

then:

source venv/bin/activate
(or)
conda activate myenv

(install Python and Node development dependencies)

make install

(login AWS SSO session)

aws sso login --profile dev && export AWS_PROFILE=dev

make up
make ps

(wait for all services to be fully started)

make test_ica_mock
make test_localstack
make start

REST API at: http://localhost:8000
Swagger at: http://localhost:8000/swagger/
ReDoc at: http://localhost:8000/redoc/
See the Makefile for more dev routine targets.
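
Once make start is running, a quick way to confirm that the endpoints listed above respond is a small probe script. This is a minimal sketch using only the standard library; the helper names endpoint_urls and is_reachable are hypothetical and not part of the Portal code base.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000"  # local REST API address from the TL;DR


def endpoint_urls(base_url=BASE_URL):
    """Build the three local URLs (REST, Swagger, ReDoc) from a base URL."""
    base = base_url.rstrip("/")
    return {
        "rest": base,
        "swagger": base + "/swagger/",
        "redoc": base + "/redoc/",
    }


def is_reachable(url, timeout=3.0):
    """Return True if the URL answers with any HTTP response, False on connection errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status (e.g. 401 or 404), so it is up.
        return True
    except (urllib.error.URLError, OSError):
        return False
```

For example, looping over endpoint_urls().items() and printing is_reachable(url) for each entry gives a one-glance status of the local stack.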

Loading Data

You can sync a dev db dump from S3 bucket as follows:

make syncdata

Then, you can drop db and restore from db dump as follows:

make loaddata

Testing

Suite

Run test suite

make test

Unit test

Run individual test case, e.g.

python manage.py test data_portal.models.tests.test_s3object.S3ObjectTests.test_unique_hash

Coverage Report

Test coverage report can be generated locally as follows.

make coverage report

View coverage report at: htmlcov/index.html

open -a Safari htmlcov/index.html

Git

Pre-commit Hook

NOTE: We use pre-commit. It guards and enforces static code analysis, such as linting and security audits, via a pre-commit hook. You are encouraged to fix any issues it reports. If you need to skip it for a good reason, you can bypass Git pre-commit hooks with git commit --no-verify.

git config --unset core.hooksPath
pre-commit install
pre-commit run --all-files

SecOps

make check
make scan
make deep scan

GitOps

NOTE: We use GitOps; releases to deployment environments are all tracked by long-running Git branches as follows.

The default branch is dev. Any merges are deployed via CI/CD to the DEV account environment.
The staging branch is stg. Any merges are deployed via CI/CD to the STG account environment.
The main branch is production. Any merges are deployed via CI/CD to the PROD account environment.

Git Flow

Typically, create your feature branch from dev to work on your story, then submit a PR to dev.
When finalising a release, create a PR using the GitHub UI from dev to stg, or from stg to main, accordingly.
A merge to stg should be a fast-forward merge from dev to maintain sync and linearity, as follows:

git checkout stg
git merge --ff-only dev
git push origin stg

A merge to main should be a fast-forward merge from stg to maintain sync and linearity, as follows:

git checkout main
git merge --ff-only stg
git push origin main

See docs/PORTAL_RELEASE.md
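
To make the fast-forward rule concrete: git merge --ff-only succeeds only when the tip of the target branch is an ancestor of the tip of the source branch. The sketch below models that check on a toy commit graph. The dictionary representation and the function names are illustrative assumptions, not how Git stores history.

```python
def ancestors(parents, commit):
    """Return the set containing `commit` and every commit reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def can_fast_forward(parents, target_tip, source_tip):
    """A fast-forward is possible only if the target tip is an ancestor of the source tip."""
    return target_tip in ancestors(parents, source_tip)


# Toy history: a <- b <- c is the dev line; x is a commit made directly on stg.
history = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
print(can_fast_forward(history, "b", "c"))  # stg at b, dev at c
print(can_fast_forward(history, "x", "c"))  # stg diverged at x
```

In a real checkout, git merge-base --is-ancestor stg dev performs the equivalent test and exits with status 0 when the fast-forward is possible.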

Portal Lambdas

aws sso login --profile dev
export AWS_PROFILE=dev
aws lambda invoke --function-name data-portal-api-dev-migrate output.json
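
The same invocation can be scripted from Python. The following is a sketch that assumes boto3 is installed and that an SSO session is active; the helpers function_name and invoke are hypothetical, and the naming pattern is inferred from the function name in the command above. Unlike the CLI call, this sketch sends an empty JSON object as the event, which is also an assumption.

```python
import json


def function_name(short_name, stage="dev", service="data-portal-api"):
    """Compose a deployed function name, following the pattern seen in the CLI command."""
    return f"{service}-{stage}-{short_name}"


def invoke(lambda_client, name, payload=None):
    """Invoke a Lambda synchronously and return (status_code, decoded_body)."""
    response = lambda_client.invoke(
        FunctionName=name,
        InvocationType="RequestResponse",
        Payload=json.dumps(payload or {}).encode("utf-8"),
    )
    body = response["Payload"].read().decode("utf-8")
    return response["StatusCode"], body
```

With AWS_PROFILE exported, a call such as invoke(boto3.client("lambda"), function_name("migrate")) would mirror the CLI example. Passing the client in as a parameter keeps the helper easy to exercise with a stub.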

Serverless

The sections above are sufficient to get the Portal backend up and running for local development. The Serverless and Deployment sections below apply when you want to extend some aspect of the Portal backend REST API or Lambda functions and deploy those features.

First, install serverless CLI and its plugins dependencies:

yarn install
npx serverless --version

You can run serverless invoke or deploy from your local machine. However, we favour the CodeBuild pipeline for deploying into the AWS dev/prod account environments.
Serverless deployment targets AWS only. AWS account-specific variables are loaded from the SSM Parameter Store of the respective login session:

aws sso login --profile dev && export AWS_PROFILE=dev

npx serverless info --stage dev
npx serverless invoke -f migrate --stage dev
npx serverless invoke -f lims_scheduled_update_processor --stage dev
npx serverless deploy --stage dev --debug='*' --verbose

Deployment

A FRESH deployment must first be done with the Terraform Data Portal stack, as IaC for the longer-lived infrastructure artifacts/services.
Then, this API (a shorter-lived, more frequently redeployed backend stack) is provisioned by the Serverless framework (see serverless.yml) within the AWS CodeBuild and CodePipeline CI/CD build setup (see buildspec.yml). AWS-specific environment variables, if any, flow from Terraform > CodeBuild > Serverless.

Destroy

Before tearing down the Terraform stack, you must run serverless remove to remove the Lambda, API Gateway, API domain, ... resources created by this serverless stack.
For example:

aws sso login --profile dev && export AWS_PROFILE=dev

npx serverless delete_domain --stage dev
npx serverless remove --stage dev

X-Ray

X-Ray SDK is disabled by default!

Portal API backend and data processing functions can be traced with X-Ray instrumentation.
You can enable X-Ray SDK by setting the Lambda Configuration Environment variable in each Portal Lambda function, e.g.

  ...
  ...
  "Environment": {
    "Variables": {
      "DJANGO_SETTINGS_MODULE": "data_portal.settings.aws",
      "AWS_XRAY_SDK_ENABLED": "true"
    }
  },
  "TracingConfig": {
    "Mode": "Active"
  },
  ...
  ...

You can observe deployed Lambda functions as follows:

aws sso login --profile dev && export AWS_PROFILE=dev

aws lambda list-functions --query 'Functions[?starts_with(FunctionName, `data-portal-api`) == `true`].FunctionName'

aws lambda list-functions | jq '.Functions[] | select(.FunctionName == "data-portal-api-dev-sqs_s3_event_processor")'
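
The prefix filter from the first command can also be written in Python. This sketch assumes boto3 is available for the actual API call; the helper portal_function_names is hypothetical and only processes response pages shaped like the output of list-functions.

```python
def portal_function_names(pages, prefix="data-portal-api"):
    """Collect function names starting with the prefix from list_functions response pages."""
    names = []
    for page in pages:
        for function in page.get("Functions", []):
            if function["FunctionName"].startswith(prefix):
                names.append(function["FunctionName"])
    return names
```

The pages argument can be supplied by boto3.client("lambda").get_paginator("list_functions").paginate(), which also handles accounts with more functions than fit in a single response.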

You can then use the AWS Lambda Console to set AWS_XRAY_SDK_ENABLED to true.
While in the AWS Lambda Console, you must also turn on: Configuration > Monitoring and operations tools > Active tracing.
Then make a few Lambda invocations and use AWS X-Ray Console > Traces to observe the traces.
Please switch the setting back off when it is no longer in use.
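
As an alternative to clicking through the console, the two settings can be toggled together from a script. This is a sketch assuming boto3 and sufficient IAM permissions; the helpers xray_update_kwargs and set_xray are hypothetical. It copies the existing environment variables first, because the Environment parameter of update_function_configuration replaces the whole variable map.

```python
def xray_update_kwargs(current_config, enabled):
    """Build keyword arguments for update_function_configuration that toggle X-Ray."""
    # Copy the existing variables so that none are dropped by the update.
    variables = dict(current_config.get("Environment", {}).get("Variables", {}))
    variables["AWS_XRAY_SDK_ENABLED"] = "true" if enabled else "false"
    return {
        "FunctionName": current_config["FunctionName"],
        "Environment": {"Variables": variables},
        "TracingConfig": {"Mode": "Active" if enabled else "PassThrough"},
    }


def set_xray(lambda_client, name, enabled):
    """Fetch the current configuration, then apply the X-Ray toggle."""
    current = lambda_client.get_function_configuration(FunctionName=name)
    return lambda_client.update_function_configuration(**xray_update_kwargs(current, enabled))
```

Calling set_xray(client, name, False) afterwards restores the disabled state, in line with the request to switch tracing off when it is no longer needed.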

Segments

By default, the X-Ray SDK supports auto-instrumentation (i.e. automatically created segments) for the Django framework, including database queries, rendering subsegments, etc.
However, you can acquire the current segment elsewhere in program code as follows:

from aws_xray_sdk.core import xray_recorder
segment = xray_recorder.current_segment()

Or you can add a subsegment to start tracing from your Lambda handler entrypoint:

from aws_xray_sdk.core import xray_recorder

def lambda_handler(event, context):
    # ... some code

    subsegment = xray_recorder.begin_subsegment('subsegment_name')
    # Code to record
    # Add metadata or annotation here, if necessary
    subsegment.put_metadata('key', dict, 'namespace')
    subsegment.put_annotation('key', 'value')

    xray_recorder.end_subsegment()

    # ... some other code
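
One caveat with the begin/end pattern above is that the subsegment stays open if the recorded code raises. A small wrapper with try/finally avoids that. This is a sketch in which traced is a hypothetical helper; it relies only on the begin_subsegment and end_subsegment calls already shown.

```python
from contextlib import contextmanager


@contextmanager
def traced(recorder, name):
    """Open a subsegment and always close it, even if the wrapped code raises."""
    subsegment = recorder.begin_subsegment(name)
    try:
        yield subsegment
    finally:
        recorder.end_subsegment()
```

Inside a handler this reads as with traced(xray_recorder, 'subsegment_name') as subsegment: followed by the code to record. The SDK also provides its own context manager, xray_recorder.in_subsegment, which may be preferable where available.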

Refer to the following links for examples and documentation:
- https://docs.aws.amazon.com/xray-sdk-for-python/latest/reference/index.html
- https://docs.aws.amazon.com/lambda/latest/dg/python-tracing.html
