uanve/data-engineering-zoomcamp

DATA ENGINEERING ZOOMCAMP

GitHub repository of the course

  • GCP
  • Docker
  • Data and SQL (done)
  • Terraform
  • Exercises - SQL queries in a Dockerized Postgres environment

Week 2: Data ingestion

Goal: Orchestrating a job to ingest web data to a Data Lake in its raw form.

  • Data Lake (GCS)
  • Ingesting data to GCP with Airflow (URL.csv > GCP bucket > BigQuery)
  • Ingesting data to local Postgres with Airflow
  • Exercises - Prepare data for next week
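
The flow above (URL.csv > bucket > warehouse) can be sketched as three small steps run in order. This is only an illustration under loud assumptions: it uses the Python standard library, a local folder stands in for the GCS bucket, SQLite stands in for BigQuery, and a plain function call stands in for the Airflow DAG. All function names, paths, and the `trips` table name are hypothetical and not taken from the course code.

```python
import csv
import shutil
import sqlite3
import urllib.request
from pathlib import Path


def download_csv(url: str, dest: Path) -> Path:
    """Step 1: fetch the raw CSV from the web and save it unchanged."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


def upload_to_lake(local_file: Path, lake_dir: Path) -> Path:
    """Step 2: copy the file into the lake (a local folder standing in for a GCS bucket)."""
    target = lake_dir / "raw" / local_file.name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(local_file, target)
    return target


def load_to_warehouse(lake_file: Path, db_path: Path, table: str) -> int:
    """Step 3: load the lake file into a table (SQLite standing in for BigQuery)."""
    with open(lake_file, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
        conn.commit()
    finally:
        conn.close()
    return len(rows)


def run_pipeline(url: str, work_dir: Path, table: str = "trips") -> int:
    """Run the three steps in order, the way an orchestrator would chain its tasks."""
    work_dir = Path(work_dir)
    local_file = download_csv(url, work_dir / "tmp" / "data.csv")
    lake_file = upload_to_lake(local_file, work_dir / "lake")
    return load_to_warehouse(lake_file, work_dir / "warehouse.db", table)
```

In an Airflow setup each function would typically become its own task, so a failed load can be retried without downloading the file again.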

Week 3: Data Warehouse

Goal: Structuring data into a Data Warehouse

  • Partitioning and clustering, Automatic re-clustering
  • Misc: BQ Geo location, BQ ML
  • Exercises
    • Run SQL queries in BigQuery
    • Optimize tables by partitioning & clustering
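
As a rough sketch of the partitioning and clustering exercise, the helper below builds a BigQuery DDL statement that copies a source table into a partitioned, clustered one. The helper name, the table names, and the column names in the usage note are hypothetical, and it assumes the partition column is a TIMESTAMP or DATETIME so that `DATE(...)` applies. It only produces the SQL text; running it against BigQuery is left out.

```python
from typing import Sequence


def partitioned_table_ddl(
    target: str,
    source: str,
    partition_column: str,
    cluster_columns: Sequence[str] = (),
) -> str:
    """Build a CREATE TABLE ... PARTITION BY ... CLUSTER BY statement for BigQuery."""
    # BigQuery accepts at most four clustering columns per table.
    if len(cluster_columns) > 4:
        raise ValueError("BigQuery allows at most 4 clustering columns")
    lines = [
        f"CREATE OR REPLACE TABLE `{target}`",
        f"PARTITION BY DATE({partition_column})",
    ]
    if cluster_columns:
        lines.append("CLUSTER BY " + ", ".join(cluster_columns))
    lines.append(f"AS SELECT * FROM `{source}`;")
    return "\n".join(lines)
```

For example, `partitioned_table_ddl("my_dataset.trips_partitioned", "my_dataset.trips_raw", "pickup_datetime", ["vendor_id"])` yields a statement that partitions by day and clusters by `vendor_id`. Queries that filter on the partition column can then scan only the matching partitions instead of the whole table.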

Week 4: Analytics engineering

Goal: Transforming Data in DWH to Analytical Views

Week 5: Batch processing

Goal:

Week 6: Streaming

Goal:

Week 7, 8 & 9: Project

Putting everything we learned into practice

About

Ongoing course: Data warehousing (BigQuery) | Batch processing (Airflow, Spark) | Analytics engineering (DBT) | Stream processing (Kafka)
