---
title: Infrastructure Catalog
has_children: true
nav_order: 3
nav_exclude: false
---

DataOps Infrastructure Catalog

The Infrastructure Catalog contains ready-to-deploy Terraform modules for a variety of production data project use cases and POCs. For information about the technical building blocks used in these modules, please see the catalog components index.

Contents

  1. AWS Catalog

  2. Azure Catalog

    • (Coming soon)
  3. GCP Catalog

    • (Coming soon)

AWS Catalog

AWS Airflow

Overview

Airflow is an open source platform to programmatically author, schedule and monitor workflows. More information here: airflow.apache.org

Documentation


AWS Bastion-Host

Overview

The bastion-host module deploys an ECS-backed container which can be used to remotely test or develop using the native cloud environment.

Applicable use cases include:

  • Debugging network firewall and routing rules
  • Debugging components which can only be run from whitelisted IP ranges
  • Offloading heavy processing from the developer's local laptop
  • Mitigating network reliability issues when working from WiFi or home networks

Documentation


AWS Data-Lake

Overview

This data lake implementation creates three buckets, one each for data, logging, and metadata. The data lake also supports lambda functions which can trigger automatically when new content is added.

  • Designed to be used in combination with the aws/data-lake-users module.
  • To add SFTP protocol support, combine this module with the aws/sftp module.

Documentation


AWS Data-Lake-Users

Overview

Automates the management of users and groups in an S3 data lake.

  • Designed to be used in combination with the aws/data-lake module.

Documentation


AWS DBT

Overview

dbt (Data Build Tool) is a CI/CD and DevOps-friendly platform for automating data transformations. More info at www.getdbt.com.

Documentation


AWS Environment

Overview

The environment module sets up common infrastructure like VPCs and network subnets. The environment output from this module is designed to be passed easily to downstream modules, streamlining the reuse of these core components.
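As a rough illustration of that hand-off, the sketch below wires the environment output into one downstream module. The `source` paths and the `name_prefix` and `environment` input names are assumptions made for this example and are not confirmed by this page; check each module's own documentation for its actual inputs.

```hcl
# Hypothetical wiring: source paths and input names are assumptions.
module "env" {
  source      = "../../catalog/aws/environment" # assumed relative path
  name_prefix = "demo-"                         # assumed input name
}

module "postgres" {
  source      = "../../catalog/aws/postgres" # assumed relative path
  name_prefix = "demo-"                      # assumed input name

  # Pass the whole environment output through, so the VPC and subnets
  # are defined once and reused by every downstream module.
  environment = module.env.environment
}
```

Passing a single object rather than individual VPC and subnet IDs keeps the downstream module calls short and makes it harder for two modules to end up in different networks by accident.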

Documentation


AWS ML-Ops

Overview

This module automates MLOps tasks associated with training Machine Learning models.

The module uses Step Functions and Lambda functions. The state machine runs hyperparameter tuning, training, and deployment. Supported deployment options are SageMaker endpoints and/or batch inference.

Documentation


AWS MySQL

Overview

Deploys a MySQL server running on RDS.

  • NOTE: Requires the AWS managed policy 'AmazonRDSFullAccess' on the account used to run Terraform
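
One possible way to satisfy this requirement is to attach the managed policy to the IAM user that runs Terraform, using the AWS provider's `aws_iam_user_policy_attachment` resource. This is only a sketch: the user name `terraform-deployer` is a hypothetical placeholder, and the attachment has to be applied by an identity that already has IAM permissions.

```hcl
# Sketch only: "terraform-deployer" is a hypothetical IAM user name.
resource "aws_iam_user_policy_attachment" "terraform_rds" {
  user       = "terraform-deployer"
  policy_arn = "arn:aws:iam::aws:policy/AmazonRDSFullAccess" # AWS managed policy
}
```

If Terraform runs under an IAM role instead of a user, `aws_iam_role_policy_attachment` with a `role` argument plays the same part.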

Documentation


AWS Postgres

Overview

Deploys a Postgres server running on RDS.

  • NOTE: Requires the AWS managed policy 'AmazonRDSFullAccess' on the account used to run Terraform

Documentation


AWS Redshift

Overview

Redshift is an AWS database platform which applies MPP (Massively Parallel Processing) principles to big data workloads in the cloud.

Documentation


AWS SFTP

Overview

Automates the management of the AWS Transfer Service, which provides an SFTP interface on top of existing S3 storage resources.

  • Designed to be used in combination with the aws/data-lake and aws/sftp-users modules.

Documentation


AWS SFTP-Users

Overview

Automates the management of SFTP user accounts on the AWS Transfer Service. AWS Transfer Service provides an SFTP interface on top of existing S3 storage resources.

  • Designed to be used in combination with the aws/sftp module.

Documentation


AWS Singer-Taps

Overview

The Singer Taps platform is the open source stack which powers the Stitch EL platform. For more information, see singer.io

Documentation


AWS Tableau-Server

Overview

This module securely deploys one or more Tableau Servers, which can then be used to host reports in production or POC environments. The module supports both Linux and Windows versions of the Tableau Server Software.

Documentation


Azure Catalog

(Coming soon)

GCP Catalog

(Coming soon)


NOTE: This documentation was auto-generated using terraform-docs. Please do not attempt to manually update this file.