syedhassaanahmed/azure-event-driven-data-pipeline

azure-event-driven-data-pipeline

Problem

A large retailer with many source systems wants a single source of truth for its data and the ability to send updates to its consumers whenever that data changes. The solution must support an unpredictable load, with peak spikes of 1500 requests/sec.

Architecture

Deployment

The entire deployment can be orchestrated using the ARM template azuredeploy.json.

To deploy using the Azure CLI:

az group deployment create -g <RESOURCE_GROUP> --template-file azuredeploy.json
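The command above assumes the resource group already exists. Newer Azure CLI versions also expose the same operation as `az deployment group create`. The following is a minimal sketch that chains both steps. The function name `deploy_pipeline` and the `<LOCATION>` argument are illustrative assumptions, and it assumes the template needs no extra parameters.

```shell
#!/usr/bin/env sh
# Sketch only: create the resource group, then deploy the ARM template into it.
# Assumes azuredeploy.json is in the current directory and needs no extra parameters.
deploy_pipeline() {
  resource_group="$1"
  location="$2"   # hypothetical input (an Azure region); not specified in this README

  az group create --name "$resource_group" --location "$location" &&
    az deployment group create \
      --resource-group "$resource_group" \
      --template-file azuredeploy.json
}

# Usage (fill in the placeholders):
#   deploy_pipeline <RESOURCE_GROUP> <LOCATION>
```

The `&&` ensures the template is only deployed if the resource group was created (or already exists) successfully.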

Once the deployment is complete, the only manual step is to copy the ConsumerReceiveFunc URL from the Azure portal and paste it multiple times (pipe | delimited) into ConsumerEgressFunc -> App Settings -> CONSUMERS.
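To avoid pasting by hand, the pipe-delimited CONSUMERS value can be assembled in the shell. This is a sketch only: `build_consumers` is a hypothetical helper, the URL shape is assumed to mirror the HttpIngressFunc URL used for load testing, and the repeat count of 3 is arbitrary.

```shell
#!/usr/bin/env sh
# Hypothetical helper: repeat one URL COUNT times, joined by '|'.
# The body runs in a subshell, so its variables do not leak into the caller.
build_consumers() (
  url="$1"
  count="$2"
  result=""
  i=0
  while [ "$i" -lt "$count" ]; do
    # Prepend "<result>|" only when result is already non-empty.
    result="${result:+$result|}$url"
    i=$((i + 1))
  done
  printf '%s\n' "$result"
)

# Placeholder URL (assumed shape); use the real ConsumerReceiveFunc URL from the portal.
CONSUMER_URL='https://<CONSUMER_FUNC_APP>.azurewebsites.net/api/ConsumerReceiveFunc?code=<FUNCTION_KEY>'
CONSUMERS="$(build_consumers "$CONSUMER_URL" 3)"
printf '%s\n' "$CONSUMERS"

# Optionally apply the setting from the CLI instead of the portal
# (<EGRESS_FUNCTION_APP> is a placeholder for the app hosting ConsumerEgressFunc):
#   az functionapp config appsettings set --name <EGRESS_FUNCTION_APP> \
#     --resource-group <RESOURCE_GROUP> --settings "CONSUMERS=$CONSUMERS"
```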

Running load tests

We perform the load tests using Azure Container Instances. After creating the resources with the above ARM template, run the following load testing script:

./generate-load.sh <RESOURCE_GROUP> <CONTAINER_NAME> https://http-ingress-func.azurewebsites.net/api/HttpIngressFunc?code=<FUNCTION_KEY>

To stream logs from the container:

az container attach -g <RESOURCE_GROUP> -n <CONTAINER_NAME>

Measuring Cosmos DB RUs using Application Insights

When we upsert into Cosmos DB, we log the Request Units (RUs) consumed to Application Insights. The following Kusto query renders a timechart of the average RUs consumed over the last 10 minutes, aggregated into 10-second bins.

customMetrics
| where timestamp > ago(10m)
    and name == "product_RU"
| summarize avg(value) by bin(timestamp, 10s)
| render timechart
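The same query can also be run from a terminal, which is handy while a load test is in progress. The sketch below assumes the Azure CLI `application-insights` extension is installed. `query_rus` is a hypothetical wrapper, and the app name and resource group are placeholders. The `render timechart` step is omitted because the CLI returns raw rows rather than a chart.

```shell
#!/usr/bin/env sh
# Same Kusto query as above, minus the render step, kept in a variable for the CLI.
QUERY='customMetrics
| where timestamp > ago(10m) and name == "product_RU"
| summarize avg(value) by bin(timestamp, 10s)'

# Hypothetical wrapper around the Application Insights query command.
#   $1: Application Insights app name or ID
#   $2: resource group containing it
query_rus() {
  az monitor app-insights query \
    --apps "$1" \
    --resource-group "$2" \
    --analytics-query "$QUERY"
}

# Usage (fill in the placeholders):
#   query_rus <APP_INSIGHTS_NAME> <RESOURCE_GROUP>
```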

Resources

Choose between Azure services that deliver messages

Choose between Flow, Logic Apps, Functions, and WebJobs

Durable Functions overview

Understanding Serverless Cold Start

Azure Function Apps: Performance Considerations

Processing 100,000 Events Per Second on Azure Functions

Choose the right data store

Modeling document data for NoSQL databases

A fast, serverless, big data pipeline powered by a single Azure Function

Load testing with Azure Container Instances and wrk