
Simplified Azure ML Starters

These starters are designed to help you understand what options are available and to give you scripts you can quickly edit for faster deployments of your own.

Examples and resource links covered:

Certain data sources are important to connect to but have no built-in connector, so you'll need other Python packages (see the Data Sources section below).

Upload / Register a Model and Use it in Inference

To quickly get started, you can upload and register a model and then run a pipeline with a dataset.

cd upload-score
az ml model create -f ./register-model.yml

To run the scoring pipeline using this new model, create some data to be uploaded during pipeline submission.

python ./quickdata.py
az ml job create --file ./main.yml
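
The scoring step in the pipeline simply loads the registered model and writes predictions for the uploaded data. A minimal sketch of what such a step can look like (the argument names, file names, and joblib format here are assumptions for illustration, not the repo's exact script):

# score_step.py - hypothetical sketch of a pipeline scoring step
import argparse
import joblib
import pandas as pd

parser = argparse.ArgumentParser()
parser.add_argument("--model_path", type=str)   # folder of the registered model
parser.add_argument("--data_path", type=str)    # folder with the uploaded CSV from quickdata.py
parser.add_argument("--output_path", type=str)  # folder to write predictions into
args = parser.parse_args()

model = joblib.load(f"{args.model_path}/model.joblib")
df = pd.read_csv(f"{args.data_path}/data.csv")
df["prediction"] = model.predict(df)
df.to_csv(f"{args.output_path}/predictions.csv", index=False)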

Additional Reading

Hyperparameter Sweeps

Hyperparameter sweeps take a training script and command and re-run them across a search space of hyperparameter values, tracking a primary metric to pick the best trial.

  • Use Hyperparameter Sweeps in a pipeline
  • Use Hyperparameter Sweeps in a job

Follow these steps to launch the sample:

  • Upload the Bank Marketing dataset (see Data Sources below).
  • Execute this CLI command:
    az ml job create --file ./hyperparameter/job-hyperparam-opt.yml
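
The sample drives the sweep through the CLI and a job YAML; the same idea can be expressed with the Python SDK v2. A rough sketch, where the compute name, environment, metric, and search space are placeholders rather than the repo's actual configuration:

from azure.ai.ml import MLClient, command
from azure.ai.ml.sweep import Choice, Uniform
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Wrap the training script in a command; hyperparameters are ordinary inputs.
train = command(
    code="./hyperparameter/src",  # hypothetical folder for the training script
    command="python train.py --learning_rate ${{inputs.learning_rate}} --n_estimators ${{inputs.n_estimators}}",
    inputs={"learning_rate": 0.1, "n_estimators": 100},
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
)

# Replace the fixed inputs with a search space, then turn the command into a sweep.
sweep_job = train(
    learning_rate=Uniform(min_value=0.01, max_value=0.3),
    n_estimators=Choice(values=[50, 100, 200]),
).sweep(
    compute="cpu-cluster",
    sampling_algorithm="random",
    primary_metric="AUC",  # must match a metric the training script logs
    goal="Maximize",
)
sweep_job.set_limits(max_total_trials=20, max_concurrent_trials=4)

ml_client.jobs.create_or_update(sweep_job)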

Parallel Training on Many Files

You can use a parallel task on an Azure ML pipeline to train on micro batches of a large file or across many individual files.
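
Under the hood, a parallel task calls an entry script once per mini batch. A minimal sketch of that contract (this is not the repo's exact script; init() runs once per worker, and run() receives a list of file paths for file-based inputs or a DataFrame slice for tabular inputs):

# Hypothetical parallel entry script illustrating the init()/run() contract.
import pandas as pd

def init():
    # Called once per worker process before any mini batch is handled;
    # load models or other expensive resources here.
    pass

def run(mini_batch):
    # For file-based inputs, mini_batch is a list of file paths.
    results = []
    for path in mini_batch:
        df = pd.read_csv(path)
        # ... train or score on this micro batch ...
        results.append(f"{path}: {len(df)} rows processed")
    return results  # one entry per processed item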

Registered Components

You can register components and re-use them.

  • Registered Components Microsoft Docs

  • Set up the Online Retail 2 dataset

  • Execute the commands below to register a couple of components you might re-use

    az ml component create --file ./registered-components/component-lag/lagger.yml
    az ml component create --file ./registered-components/component-dense-dates/densedate.yml
    • If you are running these commands multiple times, you may need to add the --version flag.
  • Execute the command below to run the pipeline using local and registered components

    az ml job create --file ./registered-components/main.yml
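
If you prefer the SDK to the CLI, a registered component can be pulled back by name and combined with a locally loaded one inside a pipeline. A sketch under assumed component names, versions, port names, and data asset name (check the YAML files in this folder for the real values):

from azure.ai.ml import MLClient, Input, load_component
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# A component registered earlier with `az ml component create`
lagger = ml_client.components.get(name="lagger", version="1")
# A component loaded straight from its local YAML definition
dense_dates = load_component(source="./registered-components/component-dense-dates/densedate.yml")

@pipeline(default_compute="cpu-cluster")
def retail_pipeline(raw_data):
    dense = dense_dates(input_data=raw_data)                # local component
    lagged = lagger(input_data=dense.outputs.output_data)   # registered component
    return {"lagged_data": lagged.outputs.output_data}

job = retail_pipeline(raw_data=Input(type="uri_file", path="azureml:online-retail-ii@latest"))
ml_client.jobs.create_or_update(job)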

Custom Containers

  • Deploy a Custom Container on Azure ML
    • The container needs to have the following:
      • A web server with a liveness, readiness, and scoring route (a minimal sketch of such a server follows this list).
      • A reference to some model object, since a model is required for the deployment (this example uploads a random serialized Python function).
  • Getting Started with Azure Container Registry
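
A minimal sketch of such a web server, using Flask (the route paths are assumptions and need to match the liveness_route, readiness_route, and scoring_route configured for the deployment; AZUREML_MODEL_DIR is the folder where Azure ML mounts the registered model, and the exact layout depends on how the model was registered):

# app.py - hypothetical custom-container server with liveness, readiness, and scoring routes
import os
import joblib
from flask import Flask, request, jsonify

app = Flask(__name__)

# The registered "model" in this example is just a serialized python object.
model_dir = os.environ.get("AZUREML_MODEL_DIR", ".")
model = joblib.load(os.path.join(model_dir, "model.joblib"))

@app.route("/live", methods=["GET"])
def liveness():
    return "healthy", 200

@app.route("/ready", methods=["GET"])
def readiness():
    return "ready", 200

@app.route("/score", methods=["POST"])
def score():
    payload = request.get_json()
    return jsonify({"model": model, "data": payload})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)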

Building a Docker Container and Deploying to an Online Endpoint

# Create a placeholder model object to register with the deployment
if (-Not (Test-Path -PathType Container .\customcontainer\pyobject)){
    New-Item -ItemType Directory -Force -Path .\customcontainer\pyobject
}
python -c "import joblib;joblib.dump(123, './customcontainer/pyobject/model.joblib');"
# Build the image (piping sends only the Dockerfile as the build context)
Get-Content .\Dockerfile | docker build - -t USERNAME/IMAGENAME:v1
# Confirm it's working locally
docker run --rm -d -p 8080:8080 --name="custom-test" USERNAME/IMAGENAME:v1

# Tag the local docker image to include your azurecr.io domain
docker tag USERNAME/IMAGENAME:v1 YOURACRNAME.azurecr.io/USERNAME/IMAGENAME:v1
# Login to Azure and your registry
az login
az acr login --name YOURACRNAME
# Push the image to your Azure Container Registry
docker push YOURACRNAME.azurecr.io/USERNAME/IMAGENAME:v1

Next, run the following using the Azure CLI (bash):

DEPLOYMENT_EXISTS=$(az ml online-deployment list --endpoint-name mycustom-endpoint | jq -r '.[].name' | grep "^devops-deploy$")
if [ -z "${DEPLOYMENT_EXISTS}" ];
then
    # Deployment does not exist yet: create it and send it all traffic
    az ml online-deployment create -f custom-deployment.yml --name devops-deploy --set environment.image=YOURACRNAME.azurecr.io/USERNAME/IMAGENAME:v1 --all-traffic
else
    # Deployment already exists: update it in place and keep traffic pointed at it
    az ml online-deployment update -f custom-deployment.yml --name devops-deploy --set environment.image=YOURACRNAME.azurecr.io/USERNAME/IMAGENAME:v1
    az ml online-endpoint update --name mycustom-endpoint --traffic "devops-deploy=100"
fi

# Give the deployment a few seconds before sending any test traffic
sleep 10

Distributed Training

In the command job, you define the distribution (for example PyTorch, TensorFlow, or MPI) along with how many nodes and processes per node to use.
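
For example, with the Python SDK v2 a distributed command job looks roughly like this (the distribution type, process counts, environment, and compute are placeholders; the repo's job YAML carries the real values):

from azure.ai.ml import MLClient, command
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())

job = command(
    code="./distributed/src",  # hypothetical folder for the training script
    command="python train.py --epochs ${{inputs.epochs}}",
    inputs={"epochs": 5},
    environment="AzureML-pytorch-1.10-ubuntu18.04-py38-cuda11-gpu@latest",
    compute="gpu-cluster",
    instance_count=2,  # number of nodes
    distribution={
        "type": "PyTorch",  # or "TensorFlow" / "Mpi"
        "process_count_per_instance": 2,  # processes (typically GPUs) per node
    },
)
ml_client.jobs.create_or_update(job)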

Data Sources:

The datasets are registered via the CLI (official docs).

You should configure your defaults for the Azure CLI.

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>

Bank Marketing

if (-Not (Test-Path -PathType Container .\downloads)){
    New-Item -ItemType Directory -Force -Path .\downloads
}
$source = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank.zip'
$destination = '.\downloads\bank.zip'
Invoke-RestMethod -Uri $source -OutFile $destination
Expand-Archive .\downloads\bank.zip -DestinationPath .\datasets\bank
az ml data create -f ./datasets/bank-marketing.yml

Online Retail 2

if (-Not (Test-Path -PathType Container .\datasets)){
    New-Item -ItemType Directory -Force -Path .\datasets
}
if (-Not (Test-Path -PathType Container .\datasets\retail)){
    New-Item -ItemType Directory -Force -Path .\datasets\retail
}
$source = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00502/online_retail_II.xlsx'
$destination = '.\datasets\retail\online_retail_II.xlsx'
Invoke-RestMethod -Uri $source -OutFile $destination
az ml data create -f ./datasets/online-retail-ii.yml

Connect to Snowflake
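
There is no built-in connector for Snowflake either, so your script can use the snowflake-connector-python package directly. A minimal sketch (account, credentials, and object names are placeholders; as with Synapse below, you might pull the password from a key vault instead of hard-coding it):

import pandas as pd
import snowflake.connector

conn = snowflake.connector.connect(
    account="youraccount-yourregion",  # Snowflake account locator
    user="yourusername",
    password="XXX",
    warehouse="yourwarehouse",
    database="yourdatabase",
    schema="yourschema",
)

try:
    cur = conn.cursor()
    cur.execute("SELECT * FROM table_name LIMIT 1000")
    df = cur.fetch_pandas_all()  # requires snowflake-connector-python[pandas]
    print(df.shape)
finally:
    conn.close()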

Connect to Azure Synapse

There is no native connector to Azure Synapse, so you'll need to use pyodbc / SQLAlchemy.

import pandas as pd
import sqlalchemy
from sqlalchemy import create_engine
from sqlalchemy import text

server = 'yoursqlpoolname.sql.azuresynapse.net'
database = 'yourdatabase'
username = 'sqlusername'
password = 'XXX' # You might store this in a key vault and use the Key Vault Client library to get the secrets

# Querying a table with sqlalchemy
connection_url = sqlalchemy.engine.URL.create(
    "mssql+pyodbc",
    username=username,
    password=password,
    host=server,
    database=database,
    query={
        "driver": "ODBC Driver 17 for SQL Server",
        "autocommit": "True",
    },
)

engine = create_engine(connection_url).execution_options(
    isolation_level="AUTOCOMMIT"
)

# Reading from SQL
with engine.connect() as conn:
    df = pd.read_sql_table("table_name", conn)
    print(df.shape)

# Writing to SQL
df.to_sql(name="exampleWrite", con=engine, schema="dbo", if_exists="append")

Open Questions

  • What's the right way to get a model folder recognized as Job Output and not Data Output?
  • What's the right way to set up a file to be read in initially and then passed around? (It seems to be mode: rw_mount.)
