- GCP
- Docker
- Data and SQL (done)
- Terraform
- Exercises - SQL queries in a Dockerized Postgres environment
Goal: Orchestrating a job to ingest web data to a Data Lake in its raw form.
- Data Lake (GCS)
- Ingesting data to GCP with Airflow (URL.csv > GCP bucket > BigQuery)
- Ingesting data to local Postgres with Airflow
- Exercises - Prepare data for next week
Goal: Structuring data into a Data Warehouse
- Partitioning and clustering, automatic re-clustering
- Misc: BQ Geo location, BQ ML
- Exercises
  - Run SQL queries in BigQuery
  - Optimize tables by partitioning & clustering
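
As a rough illustration of the partitioning and clustering exercise, the sketch below builds a BigQuery `CREATE OR REPLACE TABLE ... PARTITION BY ... CLUSTER BY` statement as a string. The helper name `partitioned_table_ddl` and the table and column names in the usage example are hypothetical and not taken from the course material. It assumes the partition column is a TIMESTAMP or DATETIME, so that wrapping it in `DATE()` is valid.

```python
def partitioned_table_ddl(target, source, partition_column, cluster_columns):
    """Return a BigQuery DDL string that copies `source` into a
    date-partitioned, clustered table named `target`.

    All arguments are illustrative; nothing here talks to BigQuery.
    """
    if not cluster_columns:
        raise ValueError("at least one clustering column is required")
    if len(cluster_columns) > 4:
        # BigQuery accepts at most four clustering columns per table.
        raise ValueError("BigQuery allows at most four clustering columns")

    cluster_list = ", ".join(cluster_columns)
    return (
        f"CREATE OR REPLACE TABLE `{target}`\n"
        f"PARTITION BY DATE({partition_column})\n"
        f"CLUSTER BY {cluster_list} AS\n"
        f"SELECT * FROM `{source}`"
    )


if __name__ == "__main__":
    # Hypothetical dataset, table, and column names.
    print(
        partitioned_table_ddl(
            target="my_dataset.trips_partitioned",
            source="my_dataset.trips_raw",
            partition_column="pickup_datetime",
            cluster_columns=["vendor_id"],
        )
    )
```

The printed statement could be pasted into the BigQuery console. Filtering on the partition column then lets BigQuery skip partitions outside the filter, and clustering sorts the data within each partition by the listed columns.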
Goal: Transforming Data in DWH to Analytical Views
Goal:
Goal:
Putting everything we learned to practice