These starters are designed to show you what options are available and let you quickly adapt the scripts for your own deployments.
Examples and resource links covered:
- Quickstart with a local model
- Hyperparameter Sweeps
- Parallel Training on Many Files
- Custom Containers
- Distributed Training
Some data sources are important to connect to but have no built-in connector, so they require other Python packages (see the Snowflake and Azure Synapse notes below).
To quickly get started, you can upload and register a model and then run a pipeline with a dataset.
cd upload-score
az ml model create -f ./register-model.yml
To run the scoring pipeline using this new model, create some data to be uploaded during pipeline submission.
python ./quickdata.py
az ml job create --file ./main.yml
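If you prefer the Python SDK v2 to the CLI, the model registration step can be sketched roughly as follows; the model name and path are illustrative guesses, not the values in register-model.yml:

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Model
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Register a local model folder, mirroring what register-model.yml does
model = Model(
    name="upload-score-model",  # illustrative; use the name from register-model.yml
    path="./model",             # illustrative local path to the serialized model
)
ml_client.models.create_or_update(model)
```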
Hyperparameter sweeps take a training script and command and run them as repeated trials, each with a different set of hyperparameter values drawn from a search space, so you can find the best-performing configuration.
Follow these steps to launch the sample:
- Upload the Bank Marketing dataset (setup steps are in the datasets section below).
- Execute this CLI command:
az ml job create --file ./hyperparameter/job-hyperparam-opt.yml
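For reference, the same sweep idea can be sketched with the Python SDK v2; the script name, hyperparameter, metric, environment, and compute below are assumptions rather than the values in job-hyperparam-opt.yml:

```python
from azure.ai.ml import MLClient, command
from azure.ai.ml.sweep import Uniform
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Base training job; script, input, and environment names are assumptions
base_job = command(
    code="./hyperparameter",
    command="python train.py --learning_rate ${{inputs.learning_rate}}",
    inputs={"learning_rate": 0.01},
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
)

# Swap the fixed input for a search space, then wrap the job in a sweep
sweep_job = base_job(learning_rate=Uniform(min_value=0.001, max_value=0.1)).sweep(
    compute="cpu-cluster",      # assumption
    sampling_algorithm="random",
    primary_metric="accuracy",  # assumption: whatever metric the script logs
    goal="Maximize",
)
sweep_job.set_limits(max_total_trials=20, max_concurrent_trials=4)

ml_client.jobs.create_or_update(sweep_job)
```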
You can use a parallel task on an Azure ML pipeline to train on micro batches of a large file or across many individual files.
- Set up the Online Retail 2 dataset.
- Execute the command below:

az ml job create --file ./parallel-pipeline/main.yaml
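For orientation, here is a hedged Python SDK v2 sketch of a parallel task that hands one file per mini batch to an entry script; the paths, script, and environment are placeholders, not the values in main.yaml:

```python
from azure.ai.ml import Input, Output
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.parallel import RunFunction, parallel_run_function

# Parallel task: each mini batch is one file from the input folder
file_batch_step = parallel_run_function(
    name="train_per_file",
    inputs=dict(
        job_data_path=Input(type=AssetTypes.URI_FOLDER, description="Folder of per-partition files")
    ),
    outputs=dict(job_output_path=Output(type=AssetTypes.URI_FOLDER)),
    input_data="${{inputs.job_data_path}}",
    instance_count=2,                 # nodes to fan out across
    max_concurrency_per_instance=2,   # worker processes per node
    mini_batch_size="1",              # one file per mini batch
    mini_batch_error_threshold=1,
    task=RunFunction(
        code="./parallel-pipeline/src",                        # placeholder
        entry_script="train.py",                               # placeholder
        environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",  # placeholder
        append_row_to="${{outputs.job_output_path}}",
    ),
)
```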
You can register components and re-use them.
- Set up the Online Retail 2 dataset.
- Execute the commands below to register a couple of components you might re-use:

az ml component create --file ./registered-components/component-lag/lagger.yml
az ml component create --file ./registered-components/component-dense-dates/densedate.yml

- If you are running these commands multiple times, you may need to add the --version flag.
- Execute the command below to run the pipeline using local and registered components:

az ml job create --file ./registered-components/main.yml
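For what it's worth, the same re-use works from the Python SDK v2: load the local component straight from its YAML, and fetch the registered one from the workspace. The component name and version below are guesses based on the file names:

```python
from azure.ai.ml import MLClient, load_component
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Local component: loaded directly from its YAML definition
lagger = load_component(source="./registered-components/component-lag/lagger.yml")

# Registered component: fetched from the workspace by name and version.
# "densedate" and "1" are guesses; use the values from densedate.yml.
dense_dates = ml_client.components.get(name="densedate", version="1")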
- Deploy a Custom Container on Azure ML
- The container needs to have the following:
  - A web server with liveness, readiness, and scoring routes (a minimal sketch follows this list).
  - A reference to a model object, since a model is required (this example uploads an arbitrary serialized Python function).
- Getting Started with Azure Container Registry
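The web server can be as small as this Flask sketch. The route paths are assumptions and must match the liveness_route, readiness_route, and scoring_route configured in custom-deployment.yml; AZUREML_MODEL_DIR is the environment variable under which Azure ML mounts the registered model, and the "pyobject" subfolder is a guess based on the path used below:

```python
# app.py -- minimal Flask sketch of the three routes a custom container needs
import os

import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)

# Azure ML mounts the registered model under AZUREML_MODEL_DIR
model_dir = os.getenv("AZUREML_MODEL_DIR", ".")
model = joblib.load(os.path.join(model_dir, "pyobject", "model.joblib"))

@app.route("/live", methods=["GET"])
def liveness():
    return jsonify(status="alive")

@app.route("/ready", methods=["GET"])
def readiness():
    return jsonify(status="ready")

@app.route("/score", methods=["POST"])
def score():
    payload = request.get_json(silent=True)
    # The toy "model" here is just a serialized Python object (the integer 123)
    return jsonify(result=model, received=payload)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```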
if (-Not (Test-Path -PathType Container .\customcontainer\pyobject)){
New-Item -ItemType Directory -Force -Path .\customcontainer\pyobject
}
python -c "import joblib;joblib.dump(123, './customcontainer/pyobject/model.joblib');"
Get-Content .\Dockerfile | docker build - -t USERNAME/IMAGENAME:v1
# Confirm it's working locally
docker run --rm -d -p 8080:8080 --name="custom-test" USERNAME/IMAGENAME:v1
# Tag the local docker image to include your azurecr.io domain
docker tag USERNAME/IMAGENAME:v1 YOURACRNAME.azurecr.io/USERNAME/IMAGENAME:v1
# Login to Azure and your registry
az login
az acr login --name YOURACRNAME
# Push the image to your registry
docker push YOURACRNAME.azurecr.io/USERNAME/IMAGENAME:v1
Next, run this command using the Azure CLI:
DEPLOYMENT_EXISTS=$(az ml online-deployment list --endpoint-name mycustom-endpoint | jq -r '.[].name' | grep "^devops-deploy$")
if [ -z "${DEPLOYMENT_EXISTS}" ];
then
# The deployment doesn't exist yet: create it and send it all traffic
az ml online-deployment create -f custom-deployment.yml --name devops-deploy --set environment.image=YOURACRNAME.azurecr.io/USERNAME/IMAGENAME:v1 --all-traffic
else
# The deployment already exists: update it in place and keep traffic on it
az ml online-deployment update -f custom-deployment.yml --name devops-deploy --set environment.image=YOURACRNAME.azurecr.io/USERNAME/IMAGENAME:v1
az ml online-endpoint update --name mycustom-endpoint --traffic "devops-deploy=100"
fi
sleep 10
In the command job, you define the distribution (mpi, pytorch, or tensorflow).
- Sample using TensorFlow and the MPI distribution
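A minimal Python SDK v2 sketch of such a job, where the distribution block requests MPI with two processes per node; the script, environment, and compute are assumptions:

```python
from azure.ai.ml import MLClient, command
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Distributed TensorFlow training via MPI: two nodes, two workers per node
job = command(
    code="./distributed",                                 # assumption
    command="python train.py --epochs 10",                # assumption
    environment="AzureML-tensorflow-2.12-cuda11@latest",  # assumption: any TF GPU environment
    compute="gpu-cluster",                                # assumption
    instance_count=2,
    distribution={"type": "mpi", "process_count_per_instance": 2},
)
ml_client.jobs.create_or_update(job)
```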
The datasets are registered via the CLI (official docs).
You should configure your defaults for the Azure CLI.
az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>
- (Bank Marketing @ UCI)
- Used in Hyperparameter Sweep
if (-Not (Test-Path -PathType Container .\downloads)){
New-Item -ItemType Directory -Force -Path .\downloads
}
$source = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank.zip'
$destination = '.\downloads\bank.zip'
Invoke-RestMethod -Uri $source -OutFile $destination
Expand-Archive .\downloads\bank.zip -DestinationPath .\datasets\bank
az ml data create -f ./datasets/bank-marketing.yml
- Online Retail 2
- Used in parallel-pipeline
- Used in registered-components
if (-Not (Test-Path -PathType Container .\datasets)){
New-Item -ItemType Directory -Force -Path .\datasets
}
if (-Not (Test-Path -PathType Container .\datasets\retail)){
New-Item -ItemType Directory -Force -Path .\datasets\retail
}
$source = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00502/online_retail_II.xlsx'
$destination = '.\datasets\retail\online_retail_II.xlsx'
Invoke-RestMethod -Uri $source -OutFile $destination
az ml data create -f ./datasets/online-retail-ii.yml
- Leverage the work of Mash Syed (mashhype) with an example that uses Snowpark
- It relies on the Snowflake Connector for Python
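For a rough idea of what raw connector usage looks like (the account, credentials, and table names are placeholders, and fetch_pandas_all requires the connector's pandas extra):

```python
import snowflake.connector

# Placeholders throughout; consider pulling credentials from a key vault
conn = snowflake.connector.connect(
    user="YOURUSER",
    password="XXX",
    account="YOURACCOUNT",
    warehouse="YOURWAREHOUSE",
    database="YOURDATABASE",
    schema="PUBLIC",
)
try:
    cur = conn.cursor()
    # Results can come back as a pandas DataFrame
    cur.execute("SELECT * FROM YOURTABLE LIMIT 100")
    df = cur.fetch_pandas_all()
    print(df.shape)
finally:
    conn.close()
```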
There is no native connector to Azure Synapse, so you'll need to use pyodbc / SQLAlchemy.
import pandas as pd
import sqlalchemy
from sqlalchemy import create_engine

server = 'yoursqlpoolname.sql.azuresynapse.net'
database = 'yourdatabase'
username = 'sqlusername'
password = 'XXX'  # You might store this in a key vault and use the Key Vault client library to get the secret

# Querying a table with SQLAlchemy
connection_url = sqlalchemy.engine.URL.create(
    "mssql+pyodbc",
    username=username,
    password=password,
    host=server,
    database=database,
    query={
        "driver": "ODBC Driver 17 for SQL Server",
        "autocommit": "True",
    },
)

engine = create_engine(connection_url).execution_options(
    isolation_level="AUTOCOMMIT"
)

# Reading from SQL
with engine.connect() as conn:
    df = pd.read_sql_table("table_name", conn)
    print(df.shape)

# Writing to SQL
df.to_sql(name="exampleWrite", con=engine, schema="dbo", if_exists="append")
- What's the right way to get a model folder recognized as Job Output and not Data Output?
- What's the right way to set up a file to be read in initially and then passed around? (mode: rw_mount seems to be the answer.)