🌥️ The GitOps Platform for Data Analytics utilizes Kubernetes (K8s) and Terraform IaC on the AWS Cloud, offering speed, scalability, agility, and cost efficiency. ⚡

nnthanh101/Data-Platform-Engineering

The GitOps Platform for Data Analytics on Kubernetes 🚀

🎯 The GitOps Platform for Data Analytics utilizes Kubernetes (K8s) and HashiCorp's Terraform Infrastructure as Code (IaC) on the AWS Cloud 🌥️, offering speed, scalability, agility, and cost efficiency. ⚡

Build, Scale, and Optimize Data & AI/ML Platforms on K8s

πŸ—οΈ Architecture

The diagram below shows the open-source data tools, Kubernetes operators, and frameworks supported by Data on K8s (DoK8s). It also highlights how managed Data Analytics services integrate with the DoK8s open-source tools, which are designed to be reusable, composable, and configurable.

[Image: DoK8s architecture diagram]

🌟 Features

The Data on K8s (DoK8s) solution is organized into the following focus areas.

πŸƒβ€β™€οΈ Deliverables

  • 🚀 Reproducible Local Development with Dev Containers: VS Code, K8s, Terraform (TF), Python/R
  • 🚀 JupyterHub on EKS 👈 This blueprint deploys a self-managed JupyterHub on EKS with Amazon Cognito authentication.
  • 🚀 Spark Operator with Apache YuniKorn on EKS 👈 This blueprint deploys an EKS cluster and uses Spark Operator and Apache YuniKorn to run self-managed Spark jobs.
  • 🚀 Self-managed Airflow on EKS 👈 This blueprint sets up self-managed Apache Airflow on an Amazon EKS cluster, following best practices.
  • 🚀 Argo Workflows on EKS 👈 This blueprint sets up self-managed Argo Workflows on an Amazon EKS cluster, following best practices.
  • 🚀 Kafka on EKS 👈 This blueprint deploys a self-managed Kafka cluster on EKS using the Strimzi Kafka operator.

Built with ❤️ at AWS 🌥️ K8s 🌟 Terraform 🚀.
