Automation and Orchestration

Most big data solutions consist of repeated data processing operations, encapsulated in workflows. A pipeline orchestrator is a tool that helps to automate these workflows. An orchestrator can schedule jobs, execute workflows, and coordinate dependencies among tasks.
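To make "coordinate dependencies among tasks" concrete, here is a minimal sketch of the idea using Python's standard `graphlib` module. The task names and the `run_workflow` helper are hypothetical and only for illustration; a real orchestrator would add scheduling, retries, and monitoring on top of this ordering step.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "ingest": set(),
    "clean": {"ingest"},
    "aggregate": {"clean"},
    "publish": {"aggregate"},
}


def run_workflow(dependencies, run_task):
    """Run each task only after all of its dependencies; return the order used."""
    order = list(TopologicalSorter(dependencies).static_order())
    for task in order:
        run_task(task)
    return order


executed = []
# Here "running" a task just records its name; a real task would do the work.
order = run_workflow(workflow, executed.append)
print(order)
```

Because every task in this example depends on the one before it, the only valid order is ingest, clean, aggregate, publish.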

Automation in this context means taking a manual process, such as managing a cluster through the Azure Databricks UI, and scripting it so that it becomes repeatable and configurable through parameters.
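As a rough illustration of "configurable with parameters", the sketch below builds a cluster definition from function arguments instead of UI clicks. The field names follow the Databricks Clusters API create request; the `build_cluster_spec` helper, the cluster name, and the default values are assumptions chosen for the example and should be replaced with values valid for your workspace.

```python
import json


def build_cluster_spec(name, num_workers,
                       spark_version="13.3.x-scala2.12",   # assumed runtime version
                       node_type_id="Standard_DS3_v2",     # assumed Azure VM size
                       autotermination_minutes=30):
    """Return a cluster definition as a dict, driven entirely by parameters."""
    if num_workers < 0:
        raise ValueError("num_workers must be zero or greater")
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        "autotermination_minutes": autotermination_minutes,
    }


# The same script can produce differently sized clusters per environment.
spec = build_cluster_spec("nightly-etl", num_workers=2)
payload = json.dumps(spec, indent=2)
print(payload)
```

The resulting JSON can be kept in source control and submitted by a script, so the same cluster can be recreated on demand.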

Prerequisites

The sub-topics below require you to use a generated Personal Access Token for authentication. Follow the steps in Setup to create one if needed.
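For orientation, this is a minimal sketch of how a script might attach the token to a Databricks REST API call as a bearer token. The `build_request` helper and the placeholder workspace URL are hypothetical; reading the values from the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables is an assumption made here so the token never appears in the script itself.

```python
import os
import urllib.request


def build_request(host, token, path="/api/2.0/clusters/list"):
    """Build (but do not send) an authenticated GET request to the workspace."""
    url = host.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder defaults; set the environment variables to real values before use.
host = os.environ.get("DATABRICKS_HOST",
                      "https://adb-0000000000000000.0.azuredatabricks.net")
token = os.environ.get("DATABRICKS_TOKEN", "<personal-access-token>")

request = build_request(host, token)
print(request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping the token in an environment variable or a secret store, rather than in the script, also makes it easier to rotate.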

Sub-topics