This is a demo of using Amazon SageMaker for automated model training and hyperparameter tuning, with AWS Glue ETL and Step Functions orchestration.
- glue: Glue ETL script and dependencies
- lambda_functions: Source code of Lambda functions
- nodejs-streamer: Source code of the Streamer Client Docker image
- notebooks: Example notebooks
- sagemaker: SageMaker model training job and hyperparameter tuning job configuration templates
- step_function: State Machine definition
- terraform: Infrastructure as code and automated deploy scripts
- project_config.cfg: project-level configuration
- requirements.txt: Python requirements for this project
- Create and activate a Python virtual environment, then install the project dependencies:
virtualenv --python=python3.8 .venv # this project requires Python 3.8
source .venv/bin/activate
pip install -r requirements.txt # install Python dependencies
- Modify project_config.cfg if necessary. You may also want to modify the variables in terraform/state-bucket/main.tf (TODO: automate this).
- Create a service-linked role for ECS using the AWS CLI. You may already have this role if ECS has been used in the AWS account before.
aws iam create-service-linked-role --aws-service-name ecs.amazonaws.com
- Deploy the demo by running the deploy.sh script:
cd terraform
sh deploy.sh
If you see the following exception the first time you deploy the demo, run deploy.sh again. It is most likely caused by a short delay in resource creation:
AccessDeniedException: Neither the global service principal states.amazonaws.com, nor the regional one is authorized to assume the provided role.
- You need an IG API key to use the IG demo APIs. Follow this page to create a free demo account and generate the API key.
The demo stores the credentials in AWS Secrets Manager. Go to the AWS Secrets Manager console, find the newly created secret, and add the following three key-value pairs:
ig_identifier: [your IG API demo account username]
ig_password: [your IG API demo account password]
ig_api_key: [the API key]
- Start the Streamer Client (ECS task) by running sh run_task.sh in the terraform directory.
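The three key-value pairs from the Secrets Manager step can also be written with boto3 instead of the console. The sketch below is only an illustration and is not part of the demo: `build_ig_secret` and `store_ig_secret` are hypothetical helper names, and the secret name in the usage line is a placeholder for whatever secret the deployment created in your account. It assumes boto3 is installed and AWS credentials are configured.

```python
import json


def build_ig_secret(identifier: str, password: str, api_key: str) -> str:
    """Return the JSON string holding the three keys the demo expects."""
    return json.dumps(
        {
            "ig_identifier": identifier,
            "ig_password": password,
            "ig_api_key": api_key,
        }
    )


def store_ig_secret(secret_id: str, secret_string: str, client=None) -> None:
    """Write the JSON string as the new value of an existing secret.

    secret_id is an assumption: pass the name or ARN of the secret that
    the deployment created, which this sketch does not know.
    """
    if client is None:
        # Imported lazily so build_ig_secret works without boto3 installed.
        import boto3

        client = boto3.client("secretsmanager")
    client.put_secret_value(SecretId=secret_id, SecretString=secret_string)
```

A call would look like `store_ig_secret("your-secret-name", build_ig_secret("user", "password", "api-key"))`. Avoid hard-coding real credentials in a script that is committed to version control.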
Some of the key configuration files are:
- Configuration of the Streamer Client: nodejs-streamer/modules/load_config.js (if you make changes to the client, you will have to build and deploy the Docker image yourself)
- ECS task definition template: terraform/service/template-data-collector-container-definition.json, especially if you want to use your own Docker image
- The Glue job: you can either make changes to glue/glue-etl.ipynb (then export as glue-etl.py using Jupyter magic) or directly to glue/glue-etl.py, depending on your preference
- SageMaker model training job and hyperparameter tuning job configurations: sagemaker/training_job_definition.json and sagemaker/tuning_job_config.json
- State Machine definition: step_function/state_machine_definition.json (however, the training and tuning job definition parts will be overwritten by the two files above)
Do not modify other files, including .tfvars, unless you know what you are doing.
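After editing any of the JSON files listed above, a quick syntax check before redeploying can catch typos early. The following is a minimal sketch, assuming the files are plain JSON; a template containing unquoted placeholders would be reported as invalid even if the deployment handles it correctly. `find_invalid_json` is a hypothetical helper, and the paths are the ones named above, relative to the repository root.

```python
import json
from pathlib import Path
from typing import Dict, Iterable

CONFIG_FILES = [
    "terraform/service/template-data-collector-container-definition.json",
    "sagemaker/training_job_definition.json",
    "sagemaker/tuning_job_config.json",
    "step_function/state_machine_definition.json",
]


def find_invalid_json(paths: Iterable[str], root: str = ".") -> Dict[str, str]:
    """Map each missing or unparsable file to a short error message."""
    problems = {}
    for rel in paths:
        path = Path(root) / rel
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except FileNotFoundError:
            problems[rel] = "file not found"
        except json.JSONDecodeError as err:
            problems[rel] = f"line {err.lineno}, column {err.colno}: {err.msg}"
    return problems
```

Running `find_invalid_json(CONFIG_FILES)` from the repository root returns an empty dictionary when every file exists and parses.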
To remove the deployed resources, run destroy.sh from the terraform directory:
cd terraform
sh destroy.sh