Automation and Orchestration

Most big data solutions consist of repeated data processing operations, encapsulated in workflows. A pipeline orchestrator is a tool that helps to automate these workflows. An orchestrator can schedule jobs, execute workflows, and coordinate dependencies among tasks.
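As a rough illustration of what "coordinating dependencies among tasks" means, the sketch below orders the tasks of a small workflow so that each task runs only after the tasks it depends on. The task names and the `execution_order` helper are hypothetical, and real orchestrators add scheduling, retries, and monitoring on top of this idea.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "ingest": set(),
    "clean": {"ingest"},
    "aggregate": {"clean"},
    "validate": {"clean"},
    "report": {"aggregate", "validate"},
}


def execution_order(dependencies):
    """Return task names in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(workflow)
print(order)
```

Tasks with no dependency on each other, such as `aggregate` and `validate` here, may appear in either order, which is also where an orchestrator could choose to run them in parallel.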

Automation, in this context, means taking a manual process, such as managing a cluster through the Azure Databricks UI, and scripting it so that it becomes repeatable and configurable through parameters.
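A minimal sketch of that idea: instead of filling in a cluster form by hand, a script builds the cluster definition from parameters. The `build_cluster_spec` helper is hypothetical. The field names are assumed to match the Databricks Clusters API create request, and the version and node type values are placeholders to replace with ones available in your workspace.

```python
import json


def build_cluster_spec(name, num_workers, spark_version, node_type_id):
    """Build a JSON-serializable cluster definition from parameters."""
    if num_workers < 0:
        raise ValueError("num_workers must be non-negative")
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
    }


# Placeholder values for illustration only.
spec = build_cluster_spec(
    name="nightly-etl",
    num_workers=2,
    spark_version="13.3.x-scala2.12",
    node_type_id="Standard_DS3_v2",
)
print(json.dumps(spec, indent=2))
```

Because the definition is plain data produced by a function, the same script can create differently sized clusters for development and production by changing only its arguments.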

Prerequisites

The sub-topics below require you to use a generated Personal Access Token for authentication. Follow the steps in Setup to create one if needed.
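For orientation, the sketch below shows one way a script might attach the token to a REST API call, assuming the token is sent as a bearer token in the `Authorization` header. The `build_request` helper and the workspace URL are hypothetical, and the token is read from the `DATABRICKS_TOKEN` environment variable rather than hard-coded. The request is only constructed here, not sent.

```python
import os
import urllib.request


def build_request(workspace_url, token, path="/api/2.0/clusters/list"):
    """Create an authenticated request object for a Databricks REST endpoint."""
    url = workspace_url.rstrip("/") + path
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers)


# Hypothetical workspace URL; replace with your own.
workspace = "https://adb-1234567890123456.7.azuredatabricks.net"
token = os.environ.get("DATABRICKS_TOKEN", "<personal-access-token>")

request = build_request(workspace, token)
print(request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping the token in an environment variable or a secret store avoids committing it to source control alongside the automation scripts.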

Sub-topics

Automation
- REST API
- Databricks CLI
Orchestration
- Azure Data Factory
- Apache Airflow
